ABSTRACT

This thesis has two parts, both related to the development of
smart sensor systems. The first part is a theoretical development
of two families of adaptive spatial filters for suppressing
background clutter in infrared images, based on either the
minimization of mean squared error or the maximization of signal
to noise ratio. Seven different nonlinear search techniques have
been developed for the adaptation process. They have been applied
to two real world infrared test images and exhibit a fast
convergence rate with no misadjustment. The second part is an
experimental development of a multiple microcomputer system which
can be a candidate for an on-board processor system. A multiple
star, multiple cluster architecture was developed whose
intercommunication is managed by a three level control including
a central controller, a distributed controller and a random
priority controller. The adaptive spatial filter has been
successfully implemented on this system using partitioning for
parallel computing.

I.   INTRODUCTION

A.   OBJECTIVES 

1.   Dual  Objectives  of  this  Thesis 

This  thesis  consists  of  two  closely  related  studies. 

a.  The first study is the theoretical development of adaptive
image processing algorithms for enhancement of the "target signal"
to "clutter noise" ratio in images. These algorithms will be used
in the first step of a multiple-stage image processing program for
detection of dim targets in noisy infrared images.

b.  The second study is an experimental development of a multiple
microcomputer system for implementation of these adaptive image
processing algorithms.

These two studies belong to two different technical areas. Either
topic could be the subject of one thesis project. However, they
are investigated together in this thesis because of the special
nature of a newly emerging field which inspired the research
undertaken by this project. This new field is sometimes known as
"Smart Sensors" [1, 2, 3]. Its development gained momentum only in
the late 1970's, when advances in two integrated circuit fields,
VLSI digital electronics and mosaic optical sensor arrays, were
joined together to develop new optical sensors which also have
sophisticated on-board signal/data processing capabilities. In
other words, they are SMART SENSORS. Their importance is closely
associated with the coexistence of "sensing" and "processing"
capabilities on a small volume, light weight, low power platform.
Therefore, the successful development of "smart sensor" systems
requires not only new signal/data processing algorithms to provide
the needed "smartness" but also efficient implementation by
signal/data processors whose size, weight, power and performance
are compatible with the requirements of on-board equipment in many
practical military systems.

2.   Multi-Dimensional "Smart Sensor" Signal Processing

In most optical smart sensor systems, signals of interest are in
the form of images. If the field of view of the sensor platform is
not stabilized, or locked onto a target, successive frames of
images are not registered. Signal processing can then use only
single frames of an image. Therefore, the signal is two dimensional
in terms of the spatial variables x and y. If sensors in several
spectral bands are available and well registered spatially, the
signals are three dimensional in terms of the variables x, y and λ.

In many other smart sensor systems, the field of view of the
sensor platform either does not change (as in a synchronous orbit
satellite with staring sensors), or is stabilized (as in aircraft
with step-staring sensors), or is locked onto a target (as in
missiles after they have already acquired a target). In these
cases, successive images are registered. Both single frames and
multiple frames of images are available for signal processing. The
signal is then three dimensional in terms of x, y and t. In
addition, if multi-spectral sensors are registered, the signal is
four dimensional in terms of x, y, t and λ.

Therefore, signal processing operations required for smart sensors
are often multi-dimensional. This thesis is concerned with adaptive
spatial filters for processing infrared images. This type of
spatial filter should be distinguished from the majority of image
processing methods, which are concerned with the image itself as
the signal of interest. Our primary interest is in the targets. The
image itself, often called the background clutter, is considered as
noise and must be suppressed so that dim target signals can be
revealed, allowing a threshold to be applied to initiate the
detection process. In addition to the clutter, the image may
include other noise, man-made interference and jamming, which are
all treated as noise. Only targets are considered as signals.

3.   Multiple-Stage "Smart Sensor" Signal Processing

To accomplish the objectives of most smart sensor systems in
detecting, tracking and recognizing very dim targets deeply buried
in noise, a multiple stage image processing approach is generally
needed (Table 1.1).

TABLE 1.1
IMAGE PROCESSING STAGES

Objective in Various Stages              Processing
---------------------------------------------------------------
Enhancement          Pre-threshold       Hard Limiting
                                         Adaptive Filtering

Detection            Threshold           Adaptive Threshold
                                         Target Acquisition

Tracking,            Post-threshold      Kalman Tracker
Recognition                              Target Recognition

For  more  detail,  see  Chapter  III.B.2. 

This  thesis  will  concentrate  on  the  development  of 
new  adaptive  filter  techniques  which  will  be  used  in  the 
"Enhancement"  stage  to  improve  the  "target  signal"  to 
"background  clutter  noise"  ratio  by  either  suppressing  the 
background  clutter  or  enhancing  the  target  signal,  or  both. 

B.   STATISTICAL  IMAGE  PROCESSING  TECHNIQUES  FOR 
ENHANCEMENT  OF  "TARGET  SIGNAL"  TO  "BACKGROUND 
NOISE"  RATIO  IN  INFRARED  IMAGES 

1.   Introduction 

Although the responsibility of detecting very dim targets is
shared by several steps of image processing in the pre-threshold,
threshold and post-threshold stages, the "enhancement" step before
thresholding plays a very important role because it is necessary
to improve the "target signal" to "clutter noise" ratio to
approximately one before a threshold operation can be applied.
Otherwise, too many false alarms will be collected by the
thresholding step, which makes post-threshold signal processing
difficult. Therefore, in theoretical developments of new image
processing techniques for smart sensors, a great deal of attention
is given to background clutter suppression techniques for
enhancement of the signal to noise ratio before the thresholding
step.

We have made a survey of these techniques and present them under
several classifications in Table 1.2. First, they are classified
as nonadaptive, open loop adaptive and closed loop adaptive. By
"nonadaptive," we refer to those approaches whose filters are not
designed by using the image characteristics. In the two adaptive
cases, by contrast, the filters are tailor-designed based on the
characteristics learned from the images being processed. In the
open loop adaptive case, the filter is not able to update or
correct itself when the characteristics of the image change. The
image properties must be "relearned" before a redesign of the
filter can be made. In the closed loop adaptive case, a feedback
process is provided between the filter output and the input to the
design process. In this way, any change in the image
characteristics will result in an increase of the output error,
which is used to automatically update and correct the filter
design.
TABLE 1.2
FOCAL PLANE PROCESSING TECHNIQUES FOR BACKGROUND CLUTTER SUPPRESSION

FOCAL PLANE PROCESSING ALGORITHMS                    ACTIVE GROUPS
----------------------------------------------------------------------------

NONADAPTIVE, DETERMINISTIC

  SPATIAL
    1st order, 2nd order (Laplacian), 4th order      MIT Lincoln Laboratory*
    nonrecursive spatial filter

  TEMPORAL
    Frame to frame differencing
    (nonrecursive temporal filter):
      1st and 2nd differencing                       Grumman, Rockwell
      3rd differencing                               Hughes

  SPATIAL-TEMPORAL
    Three dimensional spatial-temporal filter        Rockwell
    by variational method
    Pseudo-reticle nonrecursive spatial filter       Optical Science
    followed by recursive temporal bandpass
    or highpass filter

  SPATIAL-SPECTRAL
    Nonrecursive spatial filter followed by          MIT Lincoln Laboratory*
    two color discrimination

OPEN LOOP ADAPTIVE, DETERMINISTIC

  SPATIAL
    Background normalization                         General Electric*
    (localized adaptive threshold)

  TEMPORAL
    Bandpass filter followed by adaptive             Aerojet ElectroSystems*
    threshold
    2nd, 3rd order recursive temporal                Rockwell
    highpass filter

OPEN LOOP ADAPTIVE, STATISTICAL

  SPATIAL
    Minimization of mean square error:
      Recursive Kalman filter (spatial)              Grumman, NPGS
      Nonrecursive Wiener filter (spatial)           Lockheed, NPGS
    Maximization of signal to noise ratio:
      Nonrecursive spatial match filter              MIT Lincoln Laboratory*, NPGS

  TEMPORAL
    Maximization of likelihood ratio                 Aerospace Corp
    Minimization of mean square error:
      Nonrecursive temporal Wiener filter            Lockheed, NPGS
      Recursive temporal Kalman filter
    Maximization of signal to noise ratio            Hughes, NPGS

  SPATIAL-TEMPORAL
    Minimization of mean square error                NPGS
    Maximization of signal to noise ratio            NPGS

CLOSED LOOP ADAPTIVE, STATISTICAL

  SPATIAL
    Minimization of mean square error:
      Nonrecursive spatial filter                    NPGS
    Maximization of signal to noise ratio            NPGS

  TEMPORAL
    Minimization of mean square error                NPGS
    Maximization of signal to noise ratio            NPGS

*  Techniques developed for tactical systems
These approaches are further classified as deterministic and
statistical. In deterministic cases, the filter design is based on
non-statistical properties of the image, such as its frequency
characteristics. In statistical cases, the filter design is based
on statistical properties of the image, such as its autocorrelation
or power spectral density.

Furthermore, they are classified according to the types of signal
processing operations used: spatial, temporal, spectral or some
combination of these.

2.   Open Loop Adaptive Filter

In our research group, several nonrecursive open loop adaptive
filters have been developed. D. Bar Yehoshua [4] first developed
the nonrecursive statistical spatial filters designed by a
minimization of mean squared error criterion using theoretically
generated images based on both the first and second order Markov
models. These images are all assumed to have zero mean. D. Hilmers
[5] extended these spatial filters to process real world images,
which have non-zero mean. Further, he extended the same concept to
nonrecursive statistical temporal filters. B. Evenor [6] made two
additional extensions. First, he developed the design procedures
for spatial filters based on the maximization of signal to noise
ratio. Second, he developed a closed loop adaptive spatial filter
by extending the LMS (least mean square) algorithm used by many
one dimensional adaptive filter researchers. It will be discussed
further in the next section.

Using several real world infrared test images, these open loop
adaptive filters have been found to be very effective in
suppressing background clutter for point targets. However, they
are not responsive to any change in the characteristics of the
image being processed.

3.   Closed  Loop  Adaptive  Filter  and  this  Thesis 

The realization of this lack of true adaptive capability led to
the study of B. Evenor [6], who developed the nonrecursive closed
loop adaptive spatial filter based on the "LMS" algorithm and
tested this approach on theoretically generated images using
Markov models. However, it was discovered that the LMS algorithm
is actually a simplified version of a more general and powerful
family of closed loop adaptive filters. It was decided that the
first part of this thesis would be to develop such a general
adaptive filter approach, which includes:

-  Two  optimization  criteria: 

Minimization  of  mean  square  error 
Maximization  of  signal  to  noise  ratio 

-  General  adaptation  equation  using  gradient  search 
models 

-  A  family  of  nonlinear  searching  techniques  to  carry 
out  the  adaptation  process. 

The details of this theoretical study will be presented in
Chapter II.




C.   IMPLEMENTATION OF THE IMAGE PROCESSING PROGRAM
BY  A  MULTIPLE  MICROCOMPUTER  SYSTEM 

1.  Introduction 

A parallel effort has been made to investigate the practical
implementation of these statistical nonadaptive image processing
algorithms developed in our research group. G. Hilimitzas [7]
first investigated the execution speed and accuracy of these image
processing algorithms on a mainframe computer, the IBM 360/67.

2.  Microcomputer  Implementation 

D. Becker [8] investigated the performance of implementations of
the nonadaptive image processing algorithms on one 16 bit LSI-11
microcomputer and on a combination of this LSI-11 microcomputer
and a microcomputer compatible CDA-MSP-3 array processor. It was
found that, using high order language programming and floating
point data format, today's microcomputer implementation is still
in its infancy. Its execution speed is slow and nowhere near any
real time processing requirement. Improvements in microcomputer
implementation by using assembly language programming, integer
data format and improved programming on the array processor are
currently being developed.

3.  Multiple Microcomputer Implementation and this Thesis

To achieve real time image processing performance using
microcomputers, several improvements should be considered
simultaneously. First, the processing capability of individual
microcomputers must be improved by more imaginative programming
and by using attached special processors, such as the array
processor. Second, and probably much more important, is to take
advantage of the rapidly increasing number of microcomputers
affordable in a system by cleverly orchestrating them into an
effective concurrent parallel and pipeline execution of the whole
image processing program. The advantages offered by this type of
multiple microcomputer approach do not stop at faster execution,
but also include multi-tasking and higher reliability because of
better fault tolerance. It was decided that, to fully meet the
needs of new research for the successful development of a smart
sensor, a second part of this thesis should address the
implementation of image processing algorithms by a multiple
microcomputer system. Its details will be presented in
Chapter III.

D.   SCOPE  AND  EXTENSION  OF  THIS  THESIS 

It  should  be  strongly  emphasized  that  although  this  thesis 
specifically  developed  a  family  of  adaptive  spatial  filters 
for  the  enhancement  of  target  signal  to  noise  ratio  of  images 
and  a  multiple  microcomputer  system  for  the  implementation 
of  the  image  processing,  the  motivation  of  this  thesis  is 
to  contribute  to  the  development  of  smart  sensor  systems. 
Therefore,  the  adaptive  filter  concepts  and  design  techniques 
are not limited to spatial filters only. They can be readily
extended to a wide class of problems of poor signal to noise
ratios. The implementation issue is not limited to adaptive filter
processing only. The multiple microcomputer system is designed to
implement not only the mission signal processing but also a host
of other signal/data processing tasks for management, command,
control and communication functions.

II.   ADAPTIVE IMAGE PROCESSING

A.   INTRODUCTION 
1.   General 

The idea of an adaptive filter is inherently attractive. It does
not take any stretch of imagination to see a myriad of advantages
offered by an adaptive filter which can automatically update
itself when it is not performing according to an optimum
criterion. The development of adaptive filters started in the
early 1960's when it was extended from the sampled data control
system [9] and when it was developed for adaptive antenna
applications [10]. In ensuing years, a large number of
investigations were made for applications in antennas [11], noise
cancellation [12] and a variety of filtering applications [13-48].

It is natural that adaptive filter concepts are very attractive
for the objective of this thesis: to detect very dim targets
deeply buried in infrared background clutter. However, a survey of
adaptive filter research published in the 1970's reveals the
following facts:

a.  Practically all of the past adaptive filter research dealt
with one dimensional problems.

b.  LMS (least mean square) error has been the most widely used
criterion. Very little attention has been given to other criteria,
such as the maximization of output signal to noise ratio, which is
probably better suited for detection problems.

c.  Very little attention has been given to the convergence speed
issue of adaptive filters.

Therefore, we decided to address these three issues and develop
new adaptive image processing techniques which are
multi-dimensional, using either the mMSE (minimization of mean
square error) or the MSNR (maximization of signal to noise ratio)
criterion, and using a family of nonlinear convergence techniques
developed in the optimization field to search for the extremum in
the adaptive process.

However, the basic concept of the adaptive filter and the
traditional LMS approach will be briefly reviewed first as a
starting point to introduce the new techniques developed in this
thesis.

2.   Basic Concepts of Adaptive Filters

The  basic  concepts  of  an  adaptive  filter  can  be 
described  concisely  as  follows: 

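The filter is represented by a vector $\underline{H}$. In an
adaptive filter, $\underline{H}$ is updated in successive iteration
steps, denoted by a subscript as $\underline{H}_K$,
$\underline{H}_{K+1}$. A correction term, $\Delta\underline{H}_K$,
is generated in each iteration step such that

$$\underline{H}_{K+1} = \underline{H}_K + \Delta\underline{H}_K$$    (1.0)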

The  iteration  steps  are  carried  out  to  optimize  a  selected 
performance  function  until  the  filter  converges  to  its  steady 
state  which  also  corresponds  to  the  reaching  of  an  extremum 
of  the  performance  function  surface. 
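
The iteration in (1.0) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure developed in this thesis: the names `adapt` and `toy_correction`, the stopping tolerance, and the toy quadratic performance function are all hypothetical and chosen for the example.

```python
from typing import Callable, List

Vector = List[float]


def adapt(h0: Vector,
          correction: Callable[[Vector], Vector],
          max_steps: int = 1000,
          tol: float = 1e-10) -> Vector:
    """Iterate H_{K+1} = H_K + dH_K until the correction term is negligible.

    `correction` (hypothetical name) returns dH_K for the current filter H_K.
    """
    h = list(h0)
    for _ in range(max_steps):
        dh = correction(h)
        h = [hi + di for hi, di in zip(h, dh)]
        # Steady state: the correction no longer moves the filter.
        if max(abs(di) for di in dh) < tol:
            break
    return h


# Toy performance function J(H) = (H[0] - 1)^2 + (H[1] + 2)^2, whose
# minimum is at H = (1, -2).  The correction is a fixed fraction of the
# negative gradient of J.
def toy_correction(h: Vector) -> Vector:
    grad = [2.0 * (h[0] - 1.0), 2.0 * (h[1] + 2.0)]
    return [-0.25 * g for g in grad]


h_final = adapt([0.0, 0.0], toy_correction)
```

In this toy case the distance to the extremum halves at every step, so the loop stops well before `max_steps`. How the correction term is actually chosen is the subject of the rest of this chapter.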
The filter $\underline{H}$ could be a temporal filter or a spatial
filter. It could be a recursive filter, also called an infinite
impulse response (IIR) or zero/pole filter, or a nonrecursive
filter, also called a finite impulse response (FIR) or all zero
filter.

The performance function could be the mean square error, or the
output signal to noise ratio, or other functions such as the
likelihood ratio. The optimization objective could be either
minimization or maximization.

In  this  thesis,  two  dimensional  spatial  filters  are 
considered.   They  are  the  nonrecursive  type.   Two  types  of 
cost  functions  are  used.   Their  optimization  objectives  are 
shown  in  the  following  table. 

TABLE II.0
OBJECTIVE FUNCTIONS

Adaptive filter    Performance Function            Optimization Goal
---------------------------------------------------------------------
mMSE               Mean Square Error               Minimization
MSNR               Output Signal to Noise Ratio    Maximization

Let us consider a nonrecursive spatial filter with a filter area
of 3 by 3 pixels, which has nine filter coefficients. The cost
function is then a surface over a nine dimensional filter
coefficient space. The goal of the iterative adaptation procedure
is to search for the coordinates in this space of the extreme
point (either a minimum or a maximum) of the performance function
surface.

3.   Traditional Approach - LMS Algorithm

An overwhelmingly large portion of past adaptive filter studies
followed the approach originated by Professor B. Widrow [14],
commonly known as the LMS (least mean square) algorithm.

The performance function used in this approach is the "mean square
error." The optimization goal is "minimization." Prof. Widrow
proposed that the adaptation term $\Delta\underline{H}$ be
expressed as:

$$\Delta\underline{H} = 2\mu e \underline{X}$$

where  $\underline{X}$ = signal being processed

       $2\mu$ = a constant, called adaptive gain

       $e$ = adaptation error = $d - \underline{H}^T \underline{X}$

       $d$ = reference (or desired) signal

       $\underline{H}$ = filter coefficient vector.

The adaptation equation is then

$$\underline{H}_{K+1} = \underline{H}_K + 2\mu e \underline{X}$$

A steepest descent search technique is then used for performing
the adaptation steps.
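
A minimal one dimensional sketch of this LMS update is given below for illustration. It assumes a synthetic, noise-free reference signal generated by a known filter; the names `lms_step` and `h_true`, the value of the gain, and the test data are hypothetical and are not taken from the thesis experiments.

```python
import random
from typing import List, Sequence, Tuple


def lms_step(h: List[float], x: Sequence[float], d: float,
             mu: float) -> Tuple[List[float], float]:
    """One LMS update: e = d - H^T X, then H <- H + 2*mu*e*X."""
    e = d - sum(hi * xi for hi, xi in zip(h, x))
    h_next = [hi + 2.0 * mu * e * xi for hi, xi in zip(h, x)]
    return h_next, e


# Synthetic check: the reference d is produced by a known filter h_true,
# so the adapted filter should approach h_true.
random.seed(0)
h_true = [0.5, -0.3, 0.2]
h = [0.0, 0.0, 0.0]
mu = 0.05  # the adaptive gain is 2*mu; value chosen for this toy data only
for _ in range(2000):
    x = [random.uniform(-1.0, 1.0) for _ in range(3)]
    d = sum(t * xi for t, xi in zip(h_true, x))
    h, e = lms_step(h, x, d, mu)
```

Because the reference here contains no noise, the filter settles on `h_true`. With a noisy reference and a constant gain, the coefficients would keep fluctuating around the optimum, which is the misadjustment issue raised in the discussion that follows.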

Although  this  traditional  LMS  approach  has  been  used 
by  most  of  the  adaptive  filter  researchers,  it  is  not  without 
certain  drawbacks  which  will  be  briefly  described  as  follows. 

The adaptation equation used in the traditional approach can be
considered as a special case of a more general adaptation equation,

$$\underline{H}_{K+1} = \underline{H}_K + \alpha_K \underline{G}_K$$

an equation commonly used in the field of optimization. The term
$\underline{G}_K$ is sometimes called the "gradient," meaning the
gradient of the performance function surface. The term $\alpha_K$
is sometimes called the "step size," meaning the displacement in
the vector space $\underline{H}$. The optimization procedure at
iteration step $K+1$ gives a filter vector $\underline{H}_{K+1}$
which is closer to the optimal vector $\underline{H}^*$ than
previous filter vectors. Therefore, Prof. Widrow's imaginative
proposal can be interpreted as the following two assumptions:

$$\underline{G}_K \longleftrightarrow e\underline{X}$$

$$\alpha_K \longleftrightarrow 2\mu = \text{a constant}.$$

These two bold assumptions probably have resulted in several
inherent limitations.

a.  Because the gradient $\underline{G}_K$ is not tailored to the
performance function, convergence could be slow. Further, the
steady state filter result may not yield the best estimation.
Possibly, a steady state misadjustment could exist [24].

b.  Because the step size $\alpha_K$ is assumed to be a constant,
the adaptation procedure may never reach a steady state.

4.   This  Thesis  Research 

In view of the results of the survey and review of the status of
the adaptive filter approach as presented above, we identified a
series of research problems which must be investigated in order to
develop adaptive image processing techniques for suppressing
background clutter in infrared images and for helping the
detection of dim targets.

First, we must extend the one dimensional adaptive filter
techniques based on the mMSE criterion to two dimensions.

Second,  we  should  develop  an  adaptive  filter  based 
on  the  MSNR  criterion  which  is  presented  in  section  B. 

Third, we should develop a new adaptive equation which is more
responsive to the performance function in order to improve
convergence speed and to minimize steady state misadjustment. In
other words, the adaptive equation is in the form of

$$\underline{H}_{K+1} = \underline{H}_K + \alpha_K \underline{G}_K$$

The step size $\alpha_K$ and gradient $\underline{G}_K$ will not
take the form of $2\mu$ and $e\underline{X}$ as is customarily done
in practically all of the past adaptive filter studies based on the
LMS algorithm.

Fourth, we will investigate a variety of nonlinear gradient
techniques to search for the minimum in the case of the mMSE
filter and the maximum in the case of the MSNR filter. They are
derived and presented in sections C and D, respectively.

The results of applying these adaptive spatial filters to two real
infrared images will be presented in section F.

B.   DERIVATION OF OPTIMIZATION CRITERIA

1.   Performance Function I - mMSE

The performance function based on the mMSE criterion is derived
along with the nonrecursive spatial/temporal filter. The
nonrecursive spatial and temporal filters are described by a set
of filter coefficients, the vector $\underline{H}$, over the area
of a "search-box". The observed signal in the "ith" "search-box"
is represented by the signal vector $\underline{X}_i$. The
estimated target intensity within the search-box, $\hat{S}_i$, is
obtained by the linear filter

$$\hat{S}_i = \underline{H}^T \underline{X}_i$$    (2.00)

This process is carried out throughout the whole image.

The nonrecursive filter is represented by the vector

$$\underline{H}^T = [H(1), H(2), \ldots, H(N)]$$    (2.01)

where N is the number of pixels in the filter "search-box". The
image signal within the "ith" filter "search-box" is described by
the vector:

$$\underline{X}_i^T = [X_i(1), X_i(2), \ldots, X_i(N)]$$    (2.02)
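
The search-box operation of (2.00) through (2.02) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the search box is assumed to be square and centred on the estimation pixel, pixels are taken in row-major order, and border pixels are simply left at zero. None of these choices is specified by the text above.

```python
from typing import List

Image = List[List[float]]


def search_box_vector(image: Image, row: int, col: int,
                      size: int = 3) -> List[float]:
    """Flatten the size-by-size search box centred at (row, col) into X_i."""
    half = size // 2
    return [image[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]


def apply_filter(image: Image, h: List[float], size: int = 3) -> Image:
    """Estimate S_i = H^T X_i wherever the search box fits in the image.

    Border pixels are left at 0.0 (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            x = search_box_vector(image, r, c, size)
            out[r][c] = sum(hi * xi for hi, xi in zip(h, x))
    return out


# Usage: a 3x3 averaging filter applied to a 4x4 ramp image.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
h_avg = [1.0 / 9.0] * 9
filtered = apply_filter(image, h_avg)
```

For a 3 by 3 box, `h` has the nine coefficients mentioned earlier, and each output pixel is one inner product between the filter vector and the flattened search box.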

Throughout  this  thesis,  matrices  will  be  denoted  by  a  "~" 
under  the  symbol.  Vectors  will  be  denoted  by  a  "_"  under 
the  symbol. 

The estimation error is defined as:

$$e_i = \hat{S}_i - S_i$$    (2.03)

where $S_i$ is the signal and $\hat{S}_i$ the estimated signal in
the "search-box".

1  See Fig. 2.0
2  T denotes the transpose of the vector

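The mMSE (minimization of mean square error) performance function
is defined as:

$$J \triangleq E[e_i \cdot e_i^T]$$    (2.04)

where $E[\cdot]$ denotes the expected value. Substitution of
(2.00) and (2.03) into (2.04) gives:

$$\begin{aligned}
J &= E[(\underline{H}^T\underline{X}_i - S_i)(\underline{H}^T\underline{X}_i - S_i)^T] \\
  &= E[\underline{H}^T\underline{X}_i\underline{X}_i^T\underline{H} - 2\underline{H}^T\underline{X}_i S_i + S_i^2]
\end{aligned}$$    (2.05)

Since the filter vector is fixed for an image, it can be moved out
of the expectation operation to give:

$$J = \underline{H}^T E[\underline{X}_i\underline{X}_i^T]\,\underline{H} - 2\underline{H}^T E[\underline{X}_i S_i] + E[S_i^2]$$    (2.06)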

In order to simplify (2.06), the following terms are defined:

(1)  The autocorrelation matrix $\underset{\sim}{R}_{XX}$ of the
observed image is:

$$\underset{\sim}{R}_{XX} = E[\underline{X}_i\underline{X}_i^T]$$    (2.07)

Being a correlation matrix, it is symmetric and positive
semidefinite; it is positive definite whenever it is nonsingular.

(2)  The cross correlation vector between the observed signal and
the target signal of interest is:

$$\underline{R}_{XS} = E[\underline{X}_i S_i]$$    (2.08)

(3)  The mean square value of the target signal is:

$$d = E[S_i^2]$$    (2.09)

Substitution of (2.07) through (2.09) into (2.06) gives:

$$J = \underline{H}^T \underset{\sim}{R}_{XX} \underline{H} - 2\underline{H}^T \underline{R}_{XS} + d$$    (2.10)

Equation (2.10) is the performance function of the mMSE criterion.
It is a quadratic function in terms of the filter vector
$\underline{H}$.
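
As a numerical illustration of (2.06) through (2.10), the sketch below replaces the expectations by sample averages over a handful of search-box vectors and checks that the quadratic form agrees with the directly averaged squared error. The function names and the toy data are hypothetical; using sample averages in place of $E[\cdot]$ is an assumption of this example.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def sample_statistics(xs: Sequence[Vector], ss: Sequence[float]):
    """Sample estimates of R_XX = E[X X^T], R_XS = E[X S] and d = E[S^2]."""
    m, n = len(xs), len(xs[0])
    r_xx = [[sum(x[a] * x[b] for x in xs) / m for b in range(n)]
            for a in range(n)]
    r_xs = [sum(x[a] * s for x, s in zip(xs, ss)) / m for a in range(n)]
    d = sum(s * s for s in ss) / m
    return r_xx, r_xs, d


def mmse_performance(h: Vector, r_xx: Matrix, r_xs: Vector, d: float) -> float:
    """J = H^T R_XX H - 2 H^T R_XS + d, as in (2.10)."""
    n = len(h)
    quad = sum(h[a] * r_xx[a][b] * h[b] for a in range(n) for b in range(n))
    cross = sum(h[a] * r_xs[a] for a in range(n))
    return quad - 2.0 * cross + d


# Toy data: three 2-pixel observations with their target values.
xs = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
ss = [1.0, 0.5, 1.5]
h = [0.8, 0.3]
r_xx, r_xs, d = sample_statistics(xs, ss)
j_quadratic = mmse_performance(h, r_xx, r_xs, d)
# Direct mean of the squared estimation error, for comparison.
j_direct = sum((sum(hi * xi for hi, xi in zip(h, x)) - s) ** 2
               for x, s in zip(xs, ss)) / len(xs)
```

The two values agree to rounding error, since (2.10) is only an algebraic rearrangement of the mean squared error.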


Figure 2.0   Search box (the figure marks the estimation pixel
within the search box).


Theorem  2.01 

The performance function (2.10) is a unimodal (i.e., has a single
minimum) function if the autocorrelation matrix
$\underset{\sim}{R}_{XX}$ is positive definite.

Proof

The stationary points of the function (2.10) are found by setting
the gradient of (2.10) with respect to $\underline{H}$ to zero.

$$\nabla_H J = 2(\underset{\sim}{R}_{XX} \underline{H}^* - \underline{R}_{XS}) = 0$$    (2.11)

Since $\underset{\sim}{R}_{XX}$ is a symmetric positive definite
matrix, its inverse exists. Therefore

$$\underline{H}^* = \underset{\sim}{R}_{XX}^{-1} \underline{R}_{XS}$$    (2.12)


Equation (2.12) is the optimum filter vector which minimizes the
cost function (2.10). In order to prove that the cost function is
minimized for $\underline{H}^*$, the second gradient of (2.10)
with respect to $\underline{H}$ is taken.

$$\nabla_H(\nabla_H J) = 2\underset{\sim}{R}_{XX}$$    (2.13)

Since $\underset{\sim}{R}_{XX}$ is positive definite, the cost
function is minimized. The minimum value is

$$J_{min} = d - \underline{R}_{XS}^T \underset{\sim}{R}_{XX}^{-1} \underline{R}_{XS}$$    (2.14)

It is obtained by substituting the optimum filter vector
$\underline{H}^*$ into (2.10).
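
The closed-form optimum can be checked on a small example. The sketch below assumes a two-coefficient filter with hypothetical statistics (the values of `r_xx`, `r_xs` and `d` are invented for illustration) and solves the 2 by 2 system by Cramer's rule; a practical implementation would use a general linear solver.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve_2x2(a: Matrix, b: Vector) -> Vector:
    """Solve the 2x2 linear system A h = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular; no unique optimum filter")
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def performance(h: Vector, r_xx: Matrix, r_xs: Vector, d: float) -> float:
    """J = H^T R_XX H - 2 H^T R_XS + d."""
    quad = sum(h[i] * r_xx[i][j] * h[j] for i in range(2) for j in range(2))
    return quad - 2.0 * sum(h[i] * r_xs[i] for i in range(2)) + d


# Hypothetical statistics for a two-coefficient filter.
r_xx = [[2.0, 1.0], [1.0, 2.0]]   # symmetric, positive definite
r_xs = [1.0, 0.0]
d = 1.0

h_opt = solve_2x2(r_xx, r_xs)                           # H* = R_XX^{-1} R_XS
j_min = d - sum(r_xs[i] * h_opt[i] for i in range(2))   # d - R_XS^T R_XX^{-1} R_XS
j_at_opt = performance(h_opt, r_xx, r_xs, d)
j_perturbed = performance([h_opt[0] + 0.1, h_opt[1]], r_xx, r_xs, d)
```

Evaluating the quadratic performance function at `h_opt` reproduces `j_min`, and any perturbation of the filter increases the cost, consistent with a single minimum.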

The second derivative of the cost function J, as described in
(2.13), is called the Hessian matrix.

If the autocorrelation matrix is singular, the cost function
(2.10) is no longer unimodal because (2.11) can be set to zero for
an infinite number of filter vectors $\underline{H}$.

It can be shown [49] that for such a case, a minimal solution can
be obtained [50, 51] by using the pseudo inverse of
$\underset{\sim}{R}_{XX}$:

$$\underline{H}^* = (\underset{\sim}{R}_{XX} \underset{\sim}{R}_{XX}^T)^{-1} \cdot \underset{\sim}{R}_{XX} \cdot \underline{R}_{XS}$$

The solution is not unique.

2.   Performance Function II - MSNR

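The observed signal in the "search-box" is represented by the
vector $\underline{X}$. Let us assume that the target signal
vector $\underline{S}$ and the clutter noise vector
$\underline{N}$ are additive:

$$\underline{X} = \underline{S} + \underline{N}$$    (2.15)

Applying the linear filter $\underline{H}$ to the input signal
vector $\underline{X}$, we obtain:

$$\underline{H}^T\underline{X} = \underline{H}^T(\underline{S} + \underline{N}) = \underline{H}^T\underline{S} + \underline{H}^T\underline{N}$$    (2.16)

Let us define the following terms:

$$S_o \triangleq \underline{H}^T\underline{S} = \text{target signal after filtering}$$    (2.17)

$$N_o \triangleq \underline{H}^T\underline{N} = \text{clutter noise after filtering}$$    (2.18)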

The  output  signal  to  clutter  noise  ratio  is  then  defined  as: 

T 
The  Power  in  the  filter  image  H  X  due  to  target  signal 

J   £   _ , , Z_^ (2.19) 

-  x 

The  Power  in  the  filter  ijnage  H  X  due  to  clutter  noise 

E[S  2] 

J    =   V"  (2.20) 

E[No2] 

Where  E[«]  denotes  the  expected  value,  substitution  of 
(2.17)  and  [2. 18)  into  (2.20)  gives: 

2 
E[(HTS)  J     E[HTSS_TH] 
J  =  —     =  (2.21) 

E[CHTN)  J     E[HTNNTH] 

The  filter  vector  H  can  be  taken  out  of  the  expectation 

operation. 
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T     T 
H  E[SS  JH 

J  =  = ~  C2.22) 

HTE[NNTJH 

Let  us  define  the  signal  autocorrelation  matrix  as: 

RQq  4  E[SSTJ  (2.23) 

and  the  clutter  noise  autocorrelation  matrix  as: 

RNN  4  E[NNTJ  C2-24) 

RXTXI  and  Rcc,  are  symmetric  and  positive  definite.   Substitution 
of  (2.23)  and  (2.24)  in  (2.22)  yields: 

HTRQQH 
J  =  ~^bb~  (2.25) 

H  RmmH 

—  ^.NN— 

The  performance  function  J  in  (2.25)  is  the  performance 
function  of  the  MSNR  criteria. 

The  filter  vector  H  is  obtained  by  maximizing  J  in 
(2.25)  with  respect  to  the  filter  vector  H. 

Theorem  2.02 

The maximum of the objective function (2.25) is
equal to the largest eigenvalue of the matrix $R_{NN}^{-1} R_{SS}$, and
the optimum filter $H^*$ is the corresponding eigenvector.

Proof 

The  proof  is  based  on  the  Cauchy-Schwarz  inequality 
by  finding  the  upper  bound  of  J. 

Since the autocorrelation matrix $R_{NN}$ is symmetric
and positive definite, there exists a square nonsingular
matrix $V$ which satisfies the relation [52]:

$$R_{NN} = V^T V \tag{2.26}$$

Substitution of (2.26) into (2.25) and using the fact that

$$V^{-1} V = V^T (V^T)^{-1} = I \tag{2.27}$$

gives

$$J = \frac{H^T R_{SS} H}{H^T (V^T V) H} \tag{2.28}$$

Let us define the normalized vector $W$ as:

$$W \triangleq \frac{VH}{\|VH\|} = \frac{VH}{\sqrt{(VH)^T (VH)}} \tag{2.29}$$

which also satisfies the normalization condition,

$$W^T W = 1 \tag{2.30}$$

Substitution of (2.29) and (2.30) into (2.28) gives

$$J = W^T (V^T)^{-1} R_{SS} V^{-1} W \tag{2.31}$$

Let us define the matrix $P$ as:

$$P \triangleq (V^T)^{-1} R_{SS} V^{-1} \tag{2.32}$$

Equation (2.31) becomes

$$J = W^T P W \tag{2.33}$$

Using the Schwarz inequality, we obtain

$$(W^T P W)^2 \le (W^T W)(W^T P^T P W) \tag{2.34}$$

Since the left side of the inequality is equal to $J^2$, the
right side of (2.34) is the upper bound of $J^2$. The performance
function $J$ reaches its maximum when the equality holds,
which occurs when:

$$W = a \cdot P W \tag{2.35}$$

where $a$ is a constant. Substituting (2.29) and (2.32) into
(2.35) gives (2.36).

$$\frac{VH}{\sqrt{(VH)^T (VH)}} = a \cdot (V^T)^{-1} R_{SS} V^{-1} \cdot \frac{VH}{\sqrt{(VH)^T (VH)}} \tag{2.36}$$

Multiplying (2.36) by

$$\sqrt{(VH)^T (VH)} \cdot V^T \tag{2.37}$$

we obtain:

$$V^T V \cdot H = a \cdot V^T (V^T)^{-1} \cdot R_{SS} V^{-1} V H \tag{2.38}$$

Substituting (2.26) and (2.27) in (2.38), we get:

$$R_{NN} H = a \cdot R_{SS} \cdot H \tag{2.39}$$

Since $R_{NN}$ is a positive definite matrix, its inverse $R_{NN}^{-1}$
exists. Multiplying (2.39) by $\frac{1}{a} \cdot R_{NN}^{-1}$, we obtain:

$$\left(R_{NN}^{-1} R_{SS} - \frac{1}{a} \cdot I\right) \cdot H^* = 0 \tag{2.40}$$

where $I$ is the identity matrix.

Equation (2.40) is called the generalized eigenvalue
eigenvector problem [52].

Substituting the $H^*$ of (2.40) into (2.25), we obtain
the maximum value of $J$. One can see that

$$J_{max} = \frac{1}{a} = \lambda_{max} \tag{2.41}$$

In other words, $J_{max}$ is the largest eigenvalue of the matrix
$R_{NN}^{-1} R_{SS}$ and $H^*$ is the corresponding eigenvector. The noise
correlation matrix $R_{NN}$ can be obtained by assuming some target
signal of interest $S$ and using the observed signal $X$ in the
following way (the signal and noise are assumed additive):

$$R_{NN} = E[(X - S)(X - S)^T]$$

(Q.E.D.)

Theorem 2.03

The  performance  function  J  in  (2.25)  is  in  general  a 
multi-modal  function. 

Proof 

Based on Theorem 2.02, the stationary points of the
performance function $J$ satisfy the eigenvector equation (2.40):

$$\left(R_{NN}^{-1} R_{SS} - \frac{1}{a} \cdot I\right) H^* = 0$$

In general, this equation has n different solutions, because
the matrix $R_{NN}^{-1} R_{SS}$ in general can have n distinct eigenvalues,
and thus n corresponding eigenvectors. So, in general, the
performance function can have one absolute maximum and n-1
local smaller maxima.
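
The eigenvalue statement of Theorem 2.02 can be checked numerically. The sketch below uses power iteration (a technique chosen here for illustration, not one named in the thesis) on $R_{NN}^{-1} R_{SS}$ for an assumed 2x2 pair of matrices, and compares the dominant eigenvalue with the value of (2.25) at the resulting vector. $R_{NN}$ is taken diagonal only to keep its inverse trivial.

```python
# Illustrative matrices (assumed values, not from the thesis).
R_SS = [[3.0, 1.0], [1.0, 2.0]]
R_NN = [[2.0, 0.0], [0.0, 1.0]]   # diagonal, so the inverse is elementwise


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def snr(H):
    """Performance function (2.25): J = H^T R_SS H / H^T R_NN H."""
    return dot(H, matvec(R_SS, H)) / dot(H, matvec(R_NN, H))


# M = R_NN^{-1} R_SS, the matrix of the eigenproblem (2.40).
M = [[R_SS[i][j] / R_NN[i][i] for j in range(2)] for i in range(2)]

H = [1.0, 0.0]
for _ in range(200):                  # power iteration with renormalisation
    H = matvec(M, H)
    scale = dot(H, H) ** 0.5
    H = [h / scale for h in H]

lam_max = dot(H, matvec(M, H))        # eigenvalue estimate (H has unit norm)
J_max = snr(H)                        # value of (2.25) at that eigenvector
J_other = snr([1.0, -1.0])            # the second eigenvector of M
```

For these values the eigenvalues of $R_{NN}^{-1} R_{SS}$ are 2.5 and 1, so `J_max` agrees with the larger one while the second eigenvector yields the smaller stationary value.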

Theorem  2.04 

The performance function $J$ is unimodal (has a single
maximum) if the matrices $R_{SS}$ and $R_{NN}$ are defined as in (2.23)
and (2.24).

Proof

The proof is based on the fact that $R_{SS}$ is a dyad.

Use equation (2.40):

$$\left(R_{NN}^{-1} R_{SS} - \frac{1}{a} \cdot I\right) H^* = 0$$

The matrix $R_{SS}$, being a dyad, can be written as:

$$R_{SS} = r \cdot r^T \tag{2.42}$$

where $r$ is a vector.

As mentioned before, for the nontrivial solution of
(2.40), the performance function is

$$J = \frac{1}{a} \tag{2.43}$$

Using (2.42) and (2.43) in (2.40), we obtain

$$R_{NN}^{-1} \cdot r \, r^T \cdot H^* = J H^* \tag{2.44}$$

Separating (2.44) into a product of a vector and a constant,
we obtain:

$$(R_{NN}^{-1} r)(r^T H^*) = J \cdot H^* \tag{2.45}$$

For generality, a constant $\beta$ can be used in the left side
of (2.45) to give:

$$(\beta \cdot R_{NN}^{-1} r)\left(\frac{1}{\beta} r^T H^*\right) = J \cdot H^* \tag{2.46}$$

Comparing both sides of (2.46), we get:

$$J_{max} = \frac{1}{\beta} \cdot r^T H^* \tag{2.47}$$

$$H^* = \beta \cdot R_{NN}^{-1} \cdot r \tag{2.48}$$

Equation (2.47) shows that if $R_{SS}$ is a dyad, the
performance function $J$ has a unique stationary point
where it reaches its maximum. The general eigenvalue problem
has a single nonzero eigenvalue $J_{max}$, and a corresponding
eigenvector: $H^* = \beta \cdot R_{NN}^{-1} \cdot r$.

(Q.E.D.) 
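
A short numerical check of (2.47) and (2.48) is sketched below. The vector `r`, the diagonal `R_NN` and the constant `beta` are assumed values chosen for illustration; any nonzero `beta` should give the same value of the performance function.

```python
# Illustrative values (assumed, not from the thesis).
r = [1.0, 2.0]
R_NN = [[2.0, 0.0], [0.0, 1.0]]                 # diagonal for a trivial inverse
R_SS = [[ri * rj for rj in r] for ri in r]      # dyad R_SS = r r^T, as in (2.42)
beta = 3.0                                      # arbitrary nonzero constant


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def snr(H):
    """Performance function (2.25)."""
    return dot(H, matvec(R_SS, H)) / dot(H, matvec(R_NN, H))


H_opt = [beta * r[i] / R_NN[i][i] for i in range(2)]   # (2.48)
J_max = dot(r, H_opt) / beta                           # (2.47)
J_at_opt = snr(H_opt)                                  # direct evaluation of (2.25)
J_orthogonal = snr([2.0, -1.0])                        # a vector with r^T H = 0
```

Here `J_max` equals $r^T R_{NN}^{-1} r$, and any filter orthogonal to `r` gives a zero ratio, which corresponds to the zero eigenvalue of the dyad case.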


C.   DERIVATION  OF  SEARCHING  TECHNIQUES  FOR  EXTREMUM: 

GRADIENT  SEARCH  METHODS  FOR  THE  MINIMUM  OF  THE  mMSE 
PERFORMANCE  FUNCTIONS 

1 .   Steepest  Descent  Method  (SD)  and 
the  Best  Step  Adaptation  Gain 

The steepest descent method is a gradient method
which uses the Jacobian gradient ($G = \nabla_H J$) of the performance
function $J$ to determine a suitable direction of search. Gradient
methods which use the Jacobian to determine the direction of
search are called first order methods. Gradient methods for
optimization are based on the Taylor expansion of the performance
function $J$, as given below:

$$J(H + \Delta H) \approx J(H) + G^T \Delta H + \tfrac{1}{2} \Delta H^T A \, \Delta H \tag{2.49}$$

where $G$ is the Jacobian gradient of $J$ and $A$ is the matrix of
second order partial derivatives called the Hessian matrix.
Equation (2.49) can be written in the form:

$$J(H + \Delta H) = J(H) + \Delta J \tag{2.50}$$

The steepest descent uses only the Jacobian, so

$$\Delta J \approx G^T \Delta H \tag{2.51}$$

In order to minimize the performance function $J$, we want to
generate a descending sequence of $J$ which finally converges
to the minimum of $J$, $J^*$. In other words, we want a negative
$\Delta J$, but:

$$\Delta J \approx \|G\| \cdot \|\Delta H\| \cdot \cos\phi$$

where $\phi$ is the angle between the two vectors, $G$ and $\Delta H$.
For maximum reduction of the cost function $J$,

$$\phi = \pi \tag{2.52}$$

From (2.52), it is obvious that the change $\Delta H$ in
the filter vector $H$ should be in the direction of the negative
gradient $-G$. This direction is called the steepest
descent direction.

The steepest descent step $\Delta H$ can be written in the
form:

$$\Delta H = -a \cdot G \tag{2.53}$$

where $-G$ is called the step direction gradient and $a$ the
step size. In adaptive filter terminology, $a$ is called the
adaptation gain.

In order to generate an iterative method, one can
represent the filter vector $H + \Delta H$ as $H_{K+1}$, and $H$ as $H_K$.
Thus,

$$H_{K+1} = H_K + \Delta H \tag{2.54}$$

Substituting (2.53) in (2.54), we obtain:

$$H_{K+1} = H_K - a_K \cdot G_K \tag{2.55}$$

Equation (2.55) is called the steepest descent iterative
method. For simplicity and without losing generality, the
negative sign will be included in $a_K$. Thus (2.55) becomes:

$$H_{K+1} = H_K + a_K G_K \tag{2.56}$$

If very small values of $\{a_K\}$ are selected, the sequence $\{H_K\}$
will converge very slowly. In order to increase the speed
of convergence substantially, we choose the step sizes which
provide the biggest descent at each step. This concept is
called the "best step". The adaptation gain $a_K$ is picked to
minimize $J(H_{K+1})$. This choice of $a_K$ constitutes a one-dimensional
minimization of the performance function $J(H_{K+1})$.

Lemma 2.05

Let $J(H_K)$ be the performance function to be minimized.
Let the filter vector $H_{K+1}$ be updated by the steepest descent
method (2.56); then the "best step" towards the minimum of
$J$ is obtained in every iteration if the adaptation gain
satisfies the relationship:

$$G_{K+1}^T \cdot G_K = 0 \tag{2.57}$$

where $G_K$ is the Jacobian gradient of $J$ with respect to the
filter vector $H$.

Proof

The performance function $J(H_{K+1})$ can be expressed as:

$$J(H_{K+1}) = J(H_K + a_K G_K) \tag{2.58}$$

The task is to find $a_K$ which minimizes $J(H_{K+1})$ by setting
the derivative of $J(H_{K+1})$ with respect to $a_K$ to zero.

$$\frac{dJ(H_{K+1})}{da_K} = 0 \tag{2.59}$$

but $H_{K+1}$ is a function of $a_K$ as shown in (2.56). Thus
(2.59) becomes:

$$\frac{d}{da_K}[J(H_{K+1})] = (\nabla_{H_{K+1}} J)^T \frac{d}{da_K}(H_{K+1}) = 0 \tag{2.60}$$

Since $G_{K+1} = \nabla_{H_{K+1}} J$ and $\frac{d}{da_K}(H_{K+1}) = G_K$, (2.60) becomes:

$$G_{K+1}^T G_K = 0 \tag{2.61}$$

From (2.61), the best step concept requires orthogonality
between the two gradient vectors, $G_{K+1}$ and $G_K$.

(Q.E.D.) 

Up to this point, the cost function $J$ was not specified,
and the derivation of the steepest descent was made
for any continuous differentiable function.

The mMSE performance function as given by (2.10)
can be written as:

$$J_K = H_K^T R_{XX} H_K - 2 H_K^T R_{XS} + d \tag{2.62}$$

The gradient $G_K$ of $J_K$ with respect to $H_K$ is given by:

$$G_K = 2(R_{XX} H_K - R_{XS}) \tag{2.63}$$

From (2.63) and (2.56), $G_{K+1}$ can be expressed as:

$$G_{K+1} = 2(R_{XX} H_{K+1} - R_{XS})$$

$$= 2(R_{XX}(H_K + a_K G_K) - R_{XS})$$

$$= 2(R_{XX} H_K - R_{XS}) + 2 \cdot R_{XX} a_K \cdot G_K$$

$$= G_K + 2 a_K \cdot R_{XX} \cdot G_K \tag{2.64}$$

Lemma (2.05) is used in (2.64) to compute the best
step $a_K$.

Since $G_{K+1}^T \cdot G_K = 0$ (Lemma 2.05),

$$(G_K + 2 a_K \cdot R_{XX} G_K)^T \cdot G_K = 0, \quad \text{see (2.64)}$$

$$G_K^T G_K + 2 a_K \cdot G_K^T R_{XX}^T G_K = 0 \tag{2.65}$$

$R_{XX}$ is a symmetric matrix, thus $R_{XX}^T = R_{XX}$. Using
this fact in (2.65), we obtain

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T G_K}{G_K^T R_{XX} G_K} \tag{2.66}$$

Equation (2.66) is the equation of the best step for the
steepest descent method.

Combining  the  results  from  (2.56),  (2.63)  and  (2.66), 
we  obtain  the  steepest  descent  adaptive  filter: 

[Step 1]   Set a starting filter vector $H_0$, stopping bound (i.e.,
max. acceptable adaptation error) $\varepsilon$, the correlation matrix
$R_{XX}$ of the observed signal, and the cross correlation
vector $R_{XS}$.

[Step 2]   Compute the gradient:

$$G_K = 2(R_{XX} H_K - R_{XS})$$

[Step 3]   Compute the adaptation gain:

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T G_K}{G_K^T R_{XX} G_K}$$

[Step 4]   Update the filter vector:

$$H_{K+1} = H_K + a_K \cdot G_K$$

[Step 5]   Test for stopping condition:

If $G_K^T G_K \le \varepsilon$, then terminate. Otherwise,
go to step 2.

The stopping criterion is chosen as $G_K^T G_K \le \varepsilon$ because
the performance function is unimodal (has a single stationary
point), and we are looking for the stationary point, which in
fact satisfies the vanishing of the gradient.
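
The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the matrix and vector values are invented, the function and helper names are hypothetical, and the stopping test is evaluated before the gain is computed so that the gain formula is never evaluated with a zero gradient.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def steepest_descent(R_XX, R_XS, H0, eps=1e-12, max_iter=1000):
    """Best-step steepest descent for J = H^T R_XX H - 2 H^T R_XS + d."""
    H = list(H0)
    for k in range(max_iter):
        # Step 2: gradient G_K = 2 (R_XX H_K - R_XS)
        G = [2.0 * (x - y) for x, y in zip(matvec(R_XX, H), R_XS)]
        gg = dot(G, G)
        # Step 5 (moved up): stop when G_K^T G_K <= eps
        if gg <= eps:
            return H, k
        # Step 3: best-step adaptation gain (2.66); note it is negative
        a = -0.5 * gg / dot(G, matvec(R_XX, G))
        # Step 4: filter update (2.56)
        H = [h + a * g for h, g in zip(H, G)]
    return H, max_iter


# Assumed example data; the exact optimum R_XX^{-1} R_XS is [0.2, 0.6].
R_XX = [[2.0, 1.0], [1.0, 3.0]]
R_XS = [1.0, 2.0]
H_sd, steps_sd = steepest_descent(R_XX, R_XS, [0.0, 0.0])
```

The returned iteration count gives a rough feel for the convergence speed that the later methods are meant to improve on.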

2.   Accelerated Steepest Descent Method (ASD)

The accelerated steepest descent method was first
introduced in 1964 by Shah, Buehler and Kempthorne [53].
Its purpose was to accelerate the convergence of the standard
steepest descent method. Its concept was incorporated in an
algorithm which converges to the minimum of any n dimensional
quadratic function in no more than 2n-1 steps. Practically,
this algorithm is not very efficient because of its sensitivity
to error propagation. For large n, the error propagation
affects the convergence rate and the method sometimes
converges as slowly as, or even more slowly than, the steepest
descent method.

The adaptation gain of the ASD method is computed
using Lemma 2.05 and the fact that the adaptive filter $H_{K+1}$
is updated by the iterative equation:

$$H_{K+1} = H_K + a_K \cdot V_K \tag{2.67}$$

From Lemma (2.05) and (2.67),

$$G_{K+1}^T \cdot V_K = 0 \tag{2.68}$$

but

$$G_{K+1} = 2(R_{XX} H_{K+1} - R_{XS}) \quad \text{[see (2.63)]}$$

$$= 2(R_{XX}(H_K + a_K V_K) - R_{XS})$$

$$= 2(R_{XX} H_K - R_{XS}) + 2 a_K R_{XX} V_K$$

$$= G_K + 2 a_K R_{XX} V_K \tag{2.69}$$

Using (2.69) and (2.68), we obtain:

$$(G_K + 2 a_K R_{XX} V_K)^T V_K = 0 \tag{2.70}$$

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T V_K}{V_K^T R_{XX} V_K} \tag{2.71}$$

The accelerated steepest descent adaptive filter is
carried out by the following steps:

[Step 1]   Set a starting filter vector, $H_0 = H_1$, stopping
bound $\varepsilon$, the correlation matrix $R_{XX}$ and the cross
correlation vector $R_{XS}$, and the gradient $G_0$.

[Step 2]   Compute the gradient $G_K$ of $J$.

$$G_K = 2(R_{XX} H_K - R_{XS})$$

[Step 3]   Compute the step direction vector $V_K$.

$$V_K = \begin{cases} G_K & \text{for } K = 2, 4, 6, \ldots \\ H_K - H_{K-2} & \text{for } K = 3, 5, 7, \ldots \end{cases}$$

[Step 4]   Compute the adaptation gain $a_K$.

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T V_K}{V_K^T R_{XX} V_K}$$

[Step 5]   Update the filter vector $H_K$.

$$H_{K+1} = H_K + a_K V_K$$

[Step 6]   Test for stopping condition.

If $G_K^T \cdot G_K \le \varepsilon$, terminate. Otherwise go to
step 2.
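
A compact sketch of this procedure follows. It makes two assumptions that the listing does not spell out: the first iteration ($K = 1$) is treated as a plain gradient step, and a zero acceleration direction is skipped by setting the gain to zero. The data and function names are illustrative only.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def accelerated_sd(R_XX, R_XS, H0, eps=1e-12, max_iter=200):
    """ASD sketch: gradient steps alternate with steps along H_K - H_{K-2}."""
    hist = [list(H0), list(H0)]        # Step 1: H_0 = H_1
    for K in range(1, max_iter + 1):
        H = hist[K]
        # Step 2: gradient of the mMSE cost
        G = [2.0 * (x - y) for x, y in zip(matvec(R_XX, H), R_XS)]
        if dot(G, G) <= eps:           # Step 6, tested before the gain
            return H, K
        # Step 3: direction (K = 1 handled as a gradient step: assumption)
        if K >= 3 and K % 2 == 1:
            V = [h - h2 for h, h2 in zip(H, hist[K - 2])]
        else:
            V = G
        # Step 4: best-step gain (2.71); zero gain if V carries no length
        vRv = dot(V, matvec(R_XX, V))
        a = 0.0 if vRv == 0.0 else -0.5 * dot(G, V) / vRv
        # Step 5: update (2.67)
        hist.append([h + a * v for h, v in zip(H, V)])
    return hist[-1], max_iter


# Assumed example data; the exact optimum is [0.2, 0.6].
R_XX = [[2.0, 1.0], [1.0, 3.0]]
R_XS = [1.0, 2.0]
H_asd, K_asd = accelerated_sd(R_XX, R_XS, [0.0, 0.0])
```

Keeping the whole history of filter vectors is wasteful; only $H_K$ and $H_{K-2}$ are needed, and the list is used here purely for readability.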

3.   Amir's  Method  (AMM) 

This  method  was  suggested  by  this  author  at  the 
beginning  of  the  research.   The  purpose  was  to  derive  a  method 
which  will  converge  faster  than  the  steepest  descent  method. 
Experiments  showed  that  the  AMM  method  converges  approximately 
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three  times  faster  than  the  SD  method  as  shown  in  Fig.  2.6a. 
This  method  is  a  non-conjugate  gradient  method  and  is  not  as 
fast  as  the  conjugate  gradient  methods.   But  it  can  replace 
the  SD  method  as  a  robust  and  faster  method. 

The  AMM  gradient  search  method  was  designed  based 
on  the  fact  that  the  gradient  of  a  unimodal  performance  func- 
tion vanishes  only  once,  at  the  stationary  point  of  the  per- 
formance function,  which  is  the  extremum  point  we  are  looking 
for. 

The adaptation procedure is derived in the following.

The functional $\Psi_K$ is defined as:

$$\Psi_K = G_K^T G_K \tag{2.71-1}$$

where $G_K$ is the gradient of the performance function $J$, as
in (2.63).

The adaptive filter $H_{K+1}$ is updated as given in
(2.56) for the SD method. The adaptation gain $a_K$ is computed
according to the "best step" concept, to minimize $\Psi_{K+1}$.
Using (2.64), we obtain:

$$\Psi_{K+1} = G_{K+1}^T G_{K+1} = (G_K + 2 a_K R_{XX} G_K)^T (G_K + 2 a_K R_{XX} G_K) \tag{2.71-2}$$

$$\Psi_{K+1} = G_K^T (I + 4 a_K R_{XX} + 4 a_K^2 R_{XX}^2) G_K \tag{2.71-3}$$

In order to get the best step, we take the derivative of
$\Psi_{K+1}$ with respect to the adaptation gain $a_K$ and set it to
zero:

$$\frac{\partial \Psi_{K+1}}{\partial a_K} = G_K^T (4 R_{XX} + 8 a_K R_{XX}^2) G_K = 4 G_K^T R_{XX} G_K + 8 a_K G_K^T R_{XX}^2 G_K = 0 \tag{2.71-4}$$

Solve (2.71-4) and get:

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T R_{XX} G_K}{G_K^T R_{XX}^2 G_K} \tag{2.71-5}$$

The AMM adaptive filter is implemented by the
following steps:

[Step 1]   Set initial filter vector $H_0$, the stopping
bound $\varepsilon$, the correlation matrix $R_{XX}$ and the cross-correlation
vector $R_{XS}$.

[Step 2]   Compute the gradient $G_K$ of the performance function
$J$.

$$G_K = 2(R_{XX} H_K - R_{XS})$$

[Step 3]   Compute the adaptation gain $a_K$:

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T R_{XX} G_K}{G_K^T R_{XX}^2 G_K}$$

[Step 4]   Update the filter vector $H_K$:

$$H_{K+1} = H_K + a_K G_K$$

[Step 5]   Test for stopping condition.

If $\Psi_K \le \varepsilon$, then terminate, otherwise go to step 2.
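
The AMM iteration can be sketched as follows. The gain used in the code is the one that minimizes $\Psi_{K+1} = G_{K+1}^T G_{K+1}$ when $G_{K+1} = G_K + 2 a_K R_{XX} G_K$ as in (2.64), namely $a_K = -\tfrac{1}{2} G_K^T R_{XX} G_K / (G_K^T R_{XX}^2 G_K)$. The example data and the function names are assumptions made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def amm(R_XX, R_XS, H0, eps=1e-12, max_iter=1000):
    """Gradient search whose gain minimises the squared gradient norm."""
    H = list(H0)
    psi_history = []
    for _ in range(max_iter):
        # Step 2: gradient of the mMSE cost
        G = [2.0 * (x - y) for x, y in zip(matvec(R_XX, H), R_XS)]
        psi = dot(G, G)                       # functional (2.71-1)
        psi_history.append(psi)
        if psi <= eps:                        # Step 5
            break
        RG = matvec(R_XX, G)
        # Step 3: gain minimising Psi_{K+1}; G^T R^2 G = (R G)^T (R G)
        a = -0.5 * dot(G, RG) / dot(RG, RG)
        # Step 4: update (2.56)
        H = [h + a * g for h, g in zip(H, G)]
    return H, psi_history


# Assumed example data; the exact optimum is [0.2, 0.6].
R_XX = [[2.0, 1.0], [1.0, 3.0]]
R_XS = [1.0, 2.0]
H_amm, psi_amm = amm(R_XX, R_XS, [0.0, 0.0])
```

Recording `psi_history` makes it easy to confirm that the squared gradient norm never increases from one iteration to the next, which is the property the gain is designed to give.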

4.   Fletcher-Reeves Conjugate Gradient Method (CGF)

The Fletcher-Reeves conjugate gradient (CGF) method was
first introduced in 1964 by Fletcher and Reeves [69]. The
method is similar to the pioneering work of Hestenes and Stiefel
[54]. The CGF method uses conjugate vectors as step
directions.

Definition 

The vectors $V_i$, $V_j$ are said to be "conjugate" with
respect to the matrix $R_{XX}$ if they satisfy the following
condition:

$$V_i^T R_{XX} V_j = 0 \quad \text{for } i \ne j \text{ and } V_i, V_j \ne 0.$$

The importance of this method is its fast convergence rate
for quadratic functions like (2.10). This method is
proved to converge in n steps, apart from rounding errors,
where n is the dimension of the filter vector.

The adaptation gain of the CGF method is computed
using Lemma 2.05 and the fact that the adaptive filter $H_{K+1}$
is updated by the iterative equation in (2.67). Following
the equations (2.67) up to (2.71) in a similar way, we
obtain the adaptation gain as:

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T V_K}{V_K^T R_{XX} V_K} \tag{2.71-6}$$

The step direction vector $V_K$ is computed by the following
iterative procedure [55]:

$$V_{K+1} = G_{K+1} + \beta_K V_K, \qquad \beta_K = \frac{G_{K+1}^T \cdot G_{K+1}}{G_K^T \cdot G_K} \tag{2.74}$$

The method of CGF was once applied to the Rosenbrock
function [54]. The performance result was poor. Subsequently,
it was suggested to restart the method every n iterations,
where n is the dimension of the vector $H$. This thesis confirmed
that the convergence of this method for our two performance
functions (2.10) and (2.25) is faster if this method
of restarts is used.

The  CGF  adaptive  filter  is  carried  out  in  the  follow- 
ing steps: 

[Step 1]   Select a starting filter vector $H_0$, the stopping
bound $\varepsilon$, the auto-correlation matrix $R_{XX}$ and the cross-correlation
vector $R_{XS}$.

[Step 2]   Compute the gradient $G_K$ of the performance function
$J$.

$$G_K = \nabla J = 2(R_{XX} H_K - R_{XS})$$

[Step 3]   Compute the step direction vector $V_K$.

$$V_K = \begin{cases} G_K & \text{if } K \bmod n = 0 \\ G_K + \dfrac{G_K^T G_K}{G_{K-1}^T G_{K-1}} V_{K-1} & \text{else} \end{cases}$$

[Step 4]   Compute the adaptation gain.

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T V_K}{V_K^T R_{XX} V_K}$$

[Step 5]   Update the filter vector $H_{K+1}$ as in (2.67).

$$H_{K+1} = H_K + a_K V_K$$

[Step 6]   Test for stopping condition.

If $G_K^T G_K \le \varepsilon$, terminate the adaptation. Otherwise go to
step 2.
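
The CGF steps above can be sketched as below, including the restart every n iterations. The 3x3 data are assumed values, the names are hypothetical, and the stopping test is placed ahead of the gain computation to avoid dividing by a vanishing direction.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def cgf(R_XX, R_XS, H0, eps=1e-12, max_iter=100):
    """Fletcher-Reeves conjugate gradient with a restart every n steps."""
    n = len(H0)
    H = list(H0)
    G_prev, V_prev = None, None
    for K in range(max_iter):
        # Step 2: gradient of the mMSE cost
        G = [2.0 * (x - y) for x, y in zip(matvec(R_XX, H), R_XS)]
        if dot(G, G) <= eps:                    # Step 6
            return H, K
        # Step 3: restart with the gradient, else add the conjugate term
        if K % n == 0:
            V = G
        else:
            beta = dot(G, G) / dot(G_prev, G_prev)
            V = [g + beta * v for g, v in zip(G, V_prev)]
        # Step 4: best-step gain (2.71-6)
        a = -0.5 * dot(G, V) / dot(V, matvec(R_XX, V))
        # Step 5: update (2.67)
        H = [h + a * v for h, v in zip(H, V)]
        G_prev, V_prev = G, V
    return H, max_iter


# Assumed example data; the exact optimum is [2/9, 1/9, 13/9].
R_XX = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
R_XS = [1.0, 2.0, 3.0]
H_cgf, K_cgf = cgf(R_XX, R_XS, [0.0, 0.0, 0.0])
```

For this three-dimensional quadratic the loop is expected to stop after at most three updates, in line with the n-step convergence property quoted for the method.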

5.   Polak-Ribière Conjugate Gradient Method (CGP)

The Polak-Ribière conjugate gradient (CGP) method
is similar to the CGF method. The difference is in the computation
of the search direction when K mod n ≠ 0. In [56],
Powell gave a theoretical reason for favoring the Polak-Ribière
algorithm. In this thesis, the author found the CGP method
more efficient and faster converging than the CGF method (see
Section F).

The search direction of the CGP method is given by
the following expression:

$$V_K = G_K + \frac{G_K^T (G_K - G_{K-1})}{G_{K-1}^T \cdot G_{K-1}} \cdot V_{K-1} \tag{2.75}$$

The CGP adaptive filter is carried out in the following steps:

[Step 1]   Select a starting filter vector $H_0$, the stopping
bound $\varepsilon$, the auto-correlation matrix $R_{XX}$ and the cross-correlation
vector $R_{XS}$.

[Step 2]   Compute the gradient $G_K$ of the performance function
$J$.

$$G_K = \nabla J = 2(R_{XX} H_K - R_{XS})$$

[Step 3]   Compute the step direction vector $V_K$.

$$V_K = \begin{cases} G_K & \text{if } K \bmod n = 0 \\ G_K + \dfrac{G_K^T (G_K - G_{K-1})}{G_{K-1}^T \cdot G_{K-1}} V_{K-1} & \text{else} \end{cases}$$

[Step 4]   Compute the adaptation gain.

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T V_K}{V_K^T R_{XX} V_K}$$

[Step 5]   Update the filter vector $H_{K+1}$ as in (2.67).

$$H_{K+1} = H_K + a_K V_K$$

[Step 6]   Test for stopping condition.

If $G_K^T G_K \le \varepsilon$, terminate the adaptation. Otherwise go
to step 2.
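
A sketch of the CGP steps follows. Relative to a Fletcher-Reeves loop, the only change is the numerator of the direction coefficient, which uses $G_K^T (G_K - G_{K-1})$ as in (2.75). Data and names are assumptions for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def cgp(R_XX, R_XS, H0, eps=1e-12, max_iter=100):
    """Polak-Ribiere conjugate gradient with a restart every n steps."""
    n = len(H0)
    H = list(H0)
    G_prev, V_prev = None, None
    for K in range(max_iter):
        # Step 2: gradient of the mMSE cost
        G = [2.0 * (x - y) for x, y in zip(matvec(R_XX, H), R_XS)]
        if dot(G, G) <= eps:                    # Step 6
            return H, K
        # Step 3: direction (2.75), restarted when K mod n = 0
        if K % n == 0:
            V = G
        else:
            diff = [g - gp for g, gp in zip(G, G_prev)]
            beta = dot(G, diff) / dot(G_prev, G_prev)
            V = [g + beta * v for g, v in zip(G, V_prev)]
        # Step 4: best-step gain
        a = -0.5 * dot(G, V) / dot(V, matvec(R_XX, V))
        # Step 5: update (2.67)
        H = [h + a * v for h, v in zip(H, V)]
        G_prev, V_prev = G, V
    return H, max_iter


# Assumed example data; the exact optimum is [2/9, 1/9, 13/9].
R_XX = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
R_XS = [1.0, 2.0, 3.0]
H_cgp, K_cgp = cgp(R_XX, R_XS, [0.0, 0.0, 0.0])
```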

6.   Davidon-Fletcher-Powell Variable Metric Method (DFP)

One of the most efficient searching methods is the
Davidon-Fletcher-Powell (DFP) method. It was developed by Fletcher
and Powell [57] from the variable metric method due to
Davidon [54, 58]. The variable metric term was coined by
Davidon to describe methods which at the K-th iteration utilize
the increment of the form

$$\Delta H = -a_K A_K G_K \tag{2.76}$$

and update the metric-correction transformation $A_K$ from
iteration to iteration. The DFP method updates the metric
$A_K$ by the iterative expression:

$$A_{K+1} = A_K + \frac{\Delta H_K \, \Delta H_K^T}{\Delta H_K^T P_K} - \frac{A_K P_K P_K^T A_K}{P_K^T A_K P_K} \tag{2.77}$$

where:

$$\Delta H_K = H_{K+1} - H_K \tag{2.78}$$

$$P_K = G_{K+1} - G_K \tag{2.79}$$

Fletcher and Powell proved that for a general
function $J$, a positive definite $A_K$ implies that $A_{K+1}$ is also
positive definite [58]. For the performance function $J$
given in (2.10), it can be shown [59] that the set
$\{a_K A_K G_K\}$ is a set of conjugate directions, so the DFP
exhibits quadratic termination in n steps.

The adaptation gain of the DFP adaptive filter is based
on the best step concept introduced in Lemma 2.05.
Using the filter update of the DFP method:

$$H_{K+1} = H_K + a_K V_K \tag{2.80}$$

the adaptation gain is found to be:

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T V_K}{V_K^T R_{XX} V_K}$$

The adaptive filter designed by the DFP method is
carried out in the following steps:

[Step 1]   Select a starting filter vector $H_0$, the starting
correction metric $A_0 = I$ (where $I$ is the identity matrix),
the gradient $G_0$, the stopping bound $\varepsilon$, the autocorrelation
matrix $R_{XX}$, and the crosscorrelation vector $R_{XS}$.

[Step 2]   Compute the step direction vector $V_K$:

$$V_K = A_K G_K$$

[Step 3]   Compute the adaptation gain.

$$a_K = -\frac{1}{2} \cdot \frac{G_K^T V_K}{V_K^T R_{XX} V_K}$$

[Step 4]   Update the filter vector $H_K$.

$$H_{K+1} = H_K + a_K V_K$$

[Step 5]   Compute the gradient $G_{K+1}$ of the function $J$.

$$G_{K+1} = 2(R_{XX} H_{K+1} - R_{XS})$$

[Step 6]   Compute the vector $P_K$.

$$P_K = G_{K+1} - G_K$$

[Step 7]   Update the variable metric $A_K$, using the filter
increment $\Delta H_K = H_{K+1} - H_K = a_K V_K$.

$$A_{K+1} = A_K + \frac{\Delta H_K \, \Delta H_K^T}{\Delta H_K^T P_K} - \frac{A_K P_K P_K^T A_K}{P_K^T A_K P_K}$$

[Step 8]   Test for stopping criterion.

If $G_{K+1}^T \cdot G_{K+1} \le \varepsilon$, terminate the adaptation, otherwise
go to step 2.
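
The DFP procedure can be sketched as follows. The sketch assumes the standard form of the DFP metric update, in which the rank-one term is built from the filter increment $H_{K+1} - H_K$, and it takes the step direction as $A_K G_K$ with the negative gain convention of (2.67). The data and all names are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def dfp(R_XX, R_XS, H0, eps=1e-12, max_iter=100):
    """Davidon-Fletcher-Powell variable metric search on the mMSE cost."""
    n = len(H0)
    H = list(H0)
    # Step 1: start from the identity metric A_0 = I
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    G = [2.0 * (x - y) for x, y in zip(matvec(R_XX, H), R_XS)]
    for K in range(max_iter):
        if dot(G, G) <= eps:                        # Step 8
            return H, K
        V = matvec(A, G)                            # Step 2: direction
        a = -0.5 * dot(G, V) / dot(V, matvec(R_XX, V))   # Step 3: gain
        dH = [a * v for v in V]                     # filter increment
        H = [h + s for h, s in zip(H, dH)]          # Step 4: update
        G_new = [2.0 * (x - y) for x, y in zip(matvec(R_XX, H), R_XS)]  # Step 5
        P = [gn - g for gn, g in zip(G_new, G)]     # Step 6
        AP = matvec(A, P)                           # A is symmetric: P^T A = (A P)^T
        s_p, p_ap = dot(dH, P), dot(P, AP)
        # Step 7: rank-two metric update
        A = [[A[i][j] + dH[i] * dH[j] / s_p - AP[i] * AP[j] / p_ap
              for j in range(n)] for i in range(n)]
        G = G_new
    return H, max_iter


# Assumed example data; the exact optimum is [2/9, 1/9, 13/9].
R_XX = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
R_XS = [1.0, 2.0, 3.0]
H_dfp, K_dfp = dfp(R_XX, R_XS, [0.0, 0.0, 0.0])
```

With exact best-step gains on a quadratic cost, the metric built this way tends towards the inverse of the Hessian $2 R_{XX}$, which is why the search is expected to finish in about n updates.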

D.   DERIVATION  OF  SEARCHING  TECHNIQUES  FOR  EXTREMUM, 
GRADIENT  SEARCH  METHODS  FOR  THE  MAXIMUM  OF  THE 
MSNR  PERFORMANCE  FUNCTION 

1 .   Approximation  for  Best  Step  Adaptation  Gain 

The signal to noise ratio performance function $J$, as
defined in (2.25), is a nonlinear function of the filter
vector $H$. The function $J$ being non-quadratic
introduces new difficulties. The methods which
have been theoretically proved to converge in n steps for
quadratic cases like the mMSE no longer converge as fast.
The adaptation gain can no longer be efficiently computed
by the best step concept because of the large amount of
computation required to obtain the best step. In order to
make this gradient search method efficient, the adaptation
gain is approximated by the "best step" concept to generate
a nondecreasing sequence of performance functions $\{J_K\}$ which
finally converges to the maximum of $J$.

Lemma  2.06 

Let the performance function $J$ be defined as in
(2.25), and the adaptive filter be updated according to
(2.67); then the best step adaptation gain at iteration step
K satisfies the relation:

$$a_K = -\frac{H_K^T (R_{SS} - J_{K+1} \cdot R_{NN}) V_K}{V_K^T (R_{SS} - J_{K+1} R_{NN}) V_K} \tag{2.81}$$

Proof

The proof is based on Lemma 2.05. Using (2.67) and
Lemma 2.05, we obtain:

$$G_{K+1}^T \cdot V_K = 0 \tag{2.82}$$

where $G_{K+1}$ is the gradient of the performance function $J$
at the K+1 iteration step, and $V_K$ is the step direction
(search direction) vector. But according to (2.25),

$$J_{K+1} = \frac{H_{K+1}^T R_{SS} H_{K+1}}{H_{K+1}^T R_{NN} H_{K+1}}$$

The gradient of $J_{K+1}$ with respect to $H_{K+1}$ is

$$G_{K+1} = \nabla_{H_{K+1}} J_{K+1} = \frac{2}{H_{K+1}^T R_{NN} H_{K+1}} (R_{SS} - J_{K+1} R_{NN}) \cdot H_{K+1} \tag{2.83}$$

Using (2.83) in (2.82), we obtain:

$$\frac{2}{H_{K+1}^T R_{NN} H_{K+1}} \cdot H_{K+1}^T (R_{SS} - J_{K+1} R_{NN}) V_K = 0 \tag{2.84}$$

But according to (2.67)

$$H_{K+1} = H_K + a_K \cdot V_K$$

and $R_{SS}$, $R_{NN}$ being symmetric and positive definite gives:

$$(H_K^T + a_K V_K^T)(R_{SS} - J_{K+1} \cdot R_{NN}) \cdot V_K = 0 \tag{2.85}$$

$$\therefore \; H_K^T (R_{SS} - J_{K+1} \cdot R_{NN}) V_K + a_K \cdot V_K^T (R_{SS} - J_{K+1} R_{NN}) V_K = 0$$

So,

$$a_K = -\frac{H_K^T (R_{SS} - J_{K+1} R_{NN}) V_K}{V_K^T (R_{SS} - J_{K+1} R_{NN}) V_K} \tag{2.86}$$

Q.E.D.

The adaptation gain $a_K$ in (2.86) cannot be obtained
because it is a function of $J_{K+1}$, which itself cannot be
computed without $a_K$. Thus the "best step" concept introduces
a nonlinear problem for the MSNR performance function. In
order to overcome this problem of solving a nonlinear equation
in each iteration, the adaptation gain will be approximated
by using $J_K$ instead of $J_{K+1}$. Since $J_K$ is obtained one step
prior to $a_K$, $J_{K+1}$ does not need to be solved. Now we must
prove that this choice of adaptation gain for the MSNR
performance function will generate a nondecreasing sequence
$\{J_K\}$ which eventually converges to the optimum $J_{max}$.

Lemma  2.07 

Let the performance function $J$ be defined as in (2.25)
and the filter $H_K$ be updated by (2.67); let the adaptation
gain $a_K$ be given by

$$a_K = -\frac{H_K^T (R_{SS} - J_K R_{NN}) V_K}{V_K^T (R_{SS} - J_K R_{NN}) V_K} \tag{2.87}$$

Then it will generate a nondecreasing sequence $\{J_K\}$ which
converges to the maximum $J_{max}$.

Proof

Using (2.25), we obtain:

$$J_{K+1} = \frac{H_{K+1}^T R_{SS} H_{K+1}}{H_{K+1}^T R_{NN} H_{K+1}} \tag{2.88}$$

Substitution of (2.67) in (2.88) gives:

$$J_{K+1} = \frac{(H_K + a_K V_K)^T R_{SS} (H_K + a_K V_K)}{(H_K + a_K V_K)^T R_{NN} (H_K + a_K V_K)} = J_K \cdot \frac{1 + 2 a_K \dfrac{V_K^T R_{SS} H_K}{H_K^T R_{SS} H_K} + a_K^2 \dfrac{V_K^T R_{SS} V_K}{H_K^T R_{SS} H_K}}{1 + 2 a_K \dfrac{V_K^T R_{NN} H_K}{H_K^T R_{NN} H_K} + a_K^2 \dfrac{V_K^T R_{NN} V_K}{H_K^T R_{NN} H_K}} \tag{2.89}$$

Equation (2.89) is simplified due to the fact that $R_{SS}$, $R_{NN}$
are symmetric and positive definite.

In order to obtain a non-decreasing sequence $\{J_K\}$ we
must satisfy $J_{K+1} \ge J_K$, but the sequence $\{J_K\}$ is positive,
so:

$$\frac{J_{K+1}}{J_K} \ge 1 \tag{2.90}$$

In order for (2.89) to satisfy (2.90), it can be seen
that:

$$1 + 2 a_K \frac{V_K^T R_{SS} H_K}{H_K^T R_{SS} H_K} + a_K^2 \frac{V_K^T R_{SS} V_K}{H_K^T R_{SS} H_K} \ge 1 + 2 a_K \frac{V_K^T R_{NN} H_K}{H_K^T R_{NN} H_K} + a_K^2 \frac{V_K^T R_{NN} V_K}{H_K^T R_{NN} H_K} \tag{2.91}$$

Using (2.91), (2.25) and the fact that $R_{NN}$, $R_{SS}$ are positive
definite matrices, we obtain:

$$2 a_K V_K^T R_{SS} H_K + a_K^2 V_K^T R_{SS} V_K \ge J_K (2 a_K V_K^T R_{NN} H_K + a_K^2 V_K^T R_{NN} V_K) \tag{2.92}$$

$$a_K^2 V_K^T (R_{SS} - J_K R_{NN}) V_K \ge -2 a_K V_K^T (R_{SS} - J_K R_{NN}) H_K$$

$$a_K \ge -2 \cdot \frac{H_K^T (R_{SS} - J_K R_{NN}) V_K}{V_K^T (R_{SS} - J_K R_{NN}) V_K} \tag{2.93}$$

So the adaptation gain given in (2.87) generates a non-decreasing
sequence $\{J_K\}$ because it satisfies (2.93).

Q.E.D. 

Lemma  2.08 

Let the performance function $J$ be defined as in (2.25)
and the filter vector $H_K$ be updated by (2.67); then for
each iteration step K, the gradient $G_K$ of $J$ is orthogonal
to the filter vector $H_K$ regardless of the adaptation gain.

Proof

The performance function given by (2.25) is

$$J_K = \frac{H_K^T R_{SS} H_K}{H_K^T R_{NN} H_K}$$

The gradient of $J_K$ with respect to the filter vector $H_K$ is
given by

$$G_K = \frac{2}{H_K^T R_{NN} H_K} (R_{SS} - J_K R_{NN}) \cdot H_K \tag{2.83}$$

From (2.83), it follows that:

$$H_K^T \cdot G_K = \frac{2}{H_K^T R_{NN} H_K} \cdot H_K^T (R_{SS} - J_K R_{NN}) \cdot H_K \tag{2.94}$$

$$H_K^T G_K = 2 \cdot \left( \frac{H_K^T R_{SS} H_K}{H_K^T R_{NN} H_K} - J_K \frac{H_K^T R_{NN} H_K}{H_K^T R_{NN} H_K} \right) \tag{2.95}$$

Using (2.25) in (2.95), we obtain:

$$H_K^T G_K = 2 \cdot (J_K - J_K) = 0$$

Therefore,

$$H_K^T G_K = 0 \tag{2.96}$$

Thus the filter vector $H_K$ at iteration step K is orthogonal
to the gradient $G_K$ of the performance function $J$.

Q.E.D. 
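
The orthogonality stated in Lemma 2.08 is easy to confirm numerically. The sketch below evaluates the gradient (2.83) for a few arbitrary filter vectors and forms the inner product with the filter; the matrices and test vectors are assumed values, and the helper names are hypothetical.

```python
# Illustrative matrices (assumed values, not from the thesis).
R_SS = [[3.0, 1.0], [1.0, 2.0]]
R_NN = [[2.0, 0.0], [0.0, 1.0]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def snr(H):
    """Performance function (2.25)."""
    return dot(H, matvec(R_SS, H)) / dot(H, matvec(R_NN, H))


def snr_gradient(H):
    """Gradient (2.83): G = 2 / (H^T R_NN H) * (R_SS - J R_NN) H."""
    J = snr(H)
    c = dot(H, matvec(R_NN, H))
    SH, NH = matvec(R_SS, H), matvec(R_NN, H)
    return [2.0 / c * (s - J * m) for s, m in zip(SH, NH)]


test_vectors = [[1.0, 0.0], [0.3, -1.7], [2.0, 5.0]]
inner_products = [dot(H, snr_gradient(H)) for H in test_vectors]
```

Each entry of `inner_products` should be zero up to rounding, independent of how the vectors were chosen.
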
2.   Steepest  Descent  Method  (SD) 

The  steepest  descent  (SD)  method  as  described  for 
the  quadratic  mMSE  perform  function  can  be  used  here  for  the 
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MSNR  performance  function  with  some  exceptions: 

The  concept  of  the  "best  step"  is  used  by  an  approx- 
imation of  the  best  adaptation  gain.   The  gradient  of  the 
performance  function  with  respect  to  the  filter  vector  is  a 
function  of  the  performance  function.   Thus,  successive 
values  of  performance  function  must  be  computed. 

The  adaptation  equation  used  here  is  identical  to 
(2.56) . 

^K+l  =  HK  +  aK  *  ^K 

The  adaptation  gain  is  obtained  from  (2.87)  by  replacing 
the  step  direction  vector  V„  with  the  gradient  G„  (the 
direction  of  the  SD) . 

The  adaptation  gain  obtained  is: 

T 
=  .  -K  (?SS  ]  JK  ?NN}-K 

-K  C?SS  "  JK  ?NN}-K 
The  matrix  Q„  is  defined  as: 

9k    ?SS  "  JK  ?NN  (2.98) 

Substituting  (2.98)  into  (2.99),  we  obtain: 

aK  "  "   T    "  (2.99) 

^K  Sk'^K 

The adaptive filter designed by the SD method is carried out in the following steps:

[Step 1]   Select a starting filter vector $H_0$ and a stopping bound $\delta$.

[Step 2]   Compute the performance function $J_K$ as in (2.25):

$$ J_K = \frac{H_K^T R_{SS} H_K}{H_K^T R_{NN} H_K} $$

[Step 3]   Compute the gradient $G_K = \nabla_{H_K} J_K$:

$$ G_K = \frac{2}{H_K^T R_{NN} H_K} \, Q_K H_K $$

where $Q_K$ is given by (2.98).

[Step 4]   Compute the adaptation gain:

$$ \alpha_K = - \frac{H_K^T Q_K G_K}{G_K^T Q_K G_K} $$

[Step 5]   Update the filter vector $H_K$.

[Step 6]   Test for stopping condition.

If $\|H_{K+1} - H_K\| / \|H_K\| \le \delta$, then terminate the adaptation. Otherwise, go to step 2.

The stopping condition is different from the one used for the mMSE criterion because in this case the gradient $G_K$ is a nonlinear function of the filter $H_K$, and $G_K \to 0$ when $\|H_K\| \to \infty$ (use (2.83) to verify). Thus a vanishing gradient does not necessarily indicate a stationary point; the gradient can also vanish when the system diverges.
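
The six steps above can be sketched as a short loop. This is a minimal sketch, assuming a hypothetical 2 x 2 example with $R_{NN} = I$ and $R_{SS} = \mathrm{diag}(2, 1)$, for which the maximum of J is 2. The function name `sd_msnr`, the default bound, and the zero-denominator safeguard are assumptions added for illustration and are not part of the thesis procedure.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def norm(v):
    return math.sqrt(dot(v, v))


def sd_msnr(h, r_ss, r_nn, delta=1e-6, max_iter=100):
    """SD search for the MSNR criterion, following Steps 1-6."""
    for _ in range(max_iter):
        ss_h, nn_h = mat_vec(r_ss, h), mat_vec(r_nn, h)
        den_j = dot(h, nn_h)
        j = dot(h, ss_h) / den_j                                     # Step 2
        g = [2.0 / den_j * (s - j * n) for s, n in zip(ss_h, nn_h)]  # Step 3
        # Q_K g with Q_K = R_SS - J_K R_NN
        q_g = [s - j * n for s, n in zip(mat_vec(r_ss, g), mat_vec(r_nn, g))]
        gqg = dot(g, q_g)
        if gqg == 0.0:  # safeguard (assumption): avoid dividing by zero
            break
        alpha = -dot(h, q_g) / gqg                                   # Step 4
        h_new = [a + alpha * b for a, b in zip(h, g)]                # Step 5
        step = norm([a - b for a, b in zip(h_new, h)])
        stop = step <= delta * norm(h)                               # Step 6
        h = h_new
        if stop:
            break
    j = dot(h, mat_vec(r_ss, h)) / dot(h, mat_vec(r_nn, h))
    return h, j


R_SS = [[2.0, 0.0], [0.0, 1.0]]
R_NN = [[1.0, 0.0], [0.0, 1.0]]
H_opt, J_opt = sd_msnr([2.0, 1.0], R_SS, R_NN)
```

Note that the loop stops on the relative change of the filter vector, not on the gradient norm, in line with the remark above on the stopping condition.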



3.   Accelerated Steepest Descent Method (ASD)

The ASD method derived in (II.C.2) is applied in this section, with some modifications, to design an adaptive filter which maximizes the performance function J in (2.25). The adaptive filter is updated according to (2.67).

The step direction vector $V_K$ is computed from the filter vector $H_K$ and the gradient $G_K$ of the performance function $J_K$:

$$ V_K = \begin{cases} -G_K & \text{for } K = 2, 4, 6, \ldots \\ H_K - H_{K-2} & \text{for } K = 3, 5, 7, \ldots \end{cases} $$

The gradient $G_K$ is obtained from (2.83) and the adaptation gain from (2.87) and Lemma 2.07.

The adaptive filter designed by the ASD method is carried out in the following steps:

[Step 1]   Set a starting filter vector $H_0 = H_1$ and a stopping bound $\delta$, and compute the performance function $J_0$ and the gradient $G_0$.

[Step 2]   Compute the performance function value at iteration step K:

$$ J_K = \frac{H_K^T R_{SS} H_K}{H_K^T R_{NN} H_K} $$

[Step 3]   Compute the gradient $G_K$ of $J_K$ with respect to $H_K$:

$$ G_K = \frac{2}{H_K^T R_{NN} H_K} \, Q_K H_K $$

where $Q_K$ is given by (2.98).

[Step 4]   Compute the step direction vector $V_K$:

$$ V_K = \begin{cases} -G_K & \text{for } K = 2, 4, 6, \ldots \\ H_K - H_{K-2} & \text{for } K = 3, 5, 7, \ldots \end{cases} $$

[Step 5]   Compute the adaptation gain:

$$ \alpha_K = - \frac{H_K^T Q_K V_K}{V_K^T Q_K V_K} $$

[Step 6]   Update the filter vector:

$$ H_{K+1} = H_K + \alpha_K V_K $$

[Step 7]   Test for stopping condition:

If $\|H_{K+1} - H_K\| / \|H_K\| \le \delta$, then terminate the adaptation, otherwise go to step 2.

4.   Fletcher-Reeves Conjugate Gradient Method (CGF)

The Fletcher-Reeves conjugate gradient (CGF) method is applied to the MSNR adaptive filter in a similar way as for the mMSE adaptive filter. However, the nonlinear MSNR performance function requires more computation and uses an approximation rather than the true "best step". The "restart" concept was used and found to accelerate the convergence.

The adaptive filter based on the CGF method is updated by the following iterative scheme:

$$ H_{K+1} = H_K + \alpha_K V_K \qquad (2.100) $$

The step direction vector $V_K$ is obtained as in (II.C.4) by the expression:

$$ V_K = \begin{cases} -G_K & \text{if } K \bmod n = 0 \\ -G_K + \dfrac{G_K^T G_K}{G_{K-1}^T G_{K-1}} V_{K-1} & \text{else} \end{cases} \qquad (2.101) $$

The adaptation gain $\alpha_K$ is obtained from Lemma (2.07) and given by

$$ \alpha_K = - \frac{H_K^T (R_{SS} - J_K R_{NN}) V_K}{V_K^T (R_{SS} - J_K R_{NN}) V_K} \qquad (2.102) $$

Using definition (2.98), we obtain:

$$ \alpha_K = - \frac{H_K^T Q_K V_K}{V_K^T Q_K V_K} \qquad (2.103) $$

The adaptive filter designed by the CGF method is carried out in the following steps:

[Step 1]   Select a starting filter vector $H_0$ and a stopping bound $\delta$.

[Step 2]   Compute the performance function $J_K$.

$$ J_K = \frac{H_K^T R_{SS} H_K}{H_K^T R_{NN} H_K} $$

[Step 3]   Compute the gradient $G_K$ of $J_K$ with respect to $H_K$.

$$ G_K = \frac{2}{H_K^T R_{NN} H_K} \, Q_K H_K $$

where $Q_K$ is given by (2.98).

[Step 4]   Compute the step direction vector $V_K$.

$$ V_K = \begin{cases} -G_K & \text{if } K \bmod n = 0 \\ -G_K + \dfrac{G_K^T G_K}{G_{K-1}^T G_{K-1}} V_{K-1} & \text{else} \end{cases} $$

[Step 5]   Compute the adaptation gain.

$$ \alpha_K = - \frac{H_K^T Q_K V_K}{V_K^T Q_K V_K} $$

[Step 6]   Update the filter vector $H_K$.

$$ H_{K+1} = H_K + \alpha_K V_K $$

[Step 7]   Test for stopping condition.

If $\|H_{K+1} - H_K\| / \|H_K\| \le \delta$, then terminate the adaptation. Otherwise go to step 2.

5.   Polak-Ribière Conjugate Gradient Method (CGP)

The CGP method is similar to the CGF method. The only difference is the way the step direction is computed.

The CGP method uses the following expression to compute the step direction vector $V_K$:

$$ V_K = \begin{cases} -G_K & \text{if } K \bmod n = 0 \\ -G_K + \dfrac{G_K^T (G_K - G_{K-1})}{G_{K-1}^T G_{K-1}} V_{K-1} & \text{else} \end{cases} $$

All the rest is identical to the CGF method. However, this method was found to converge much faster than the CGF method for all the images tested in this thesis.

The adaptive filter designed by the CGP method is carried out in the following steps:

[Step 1]   Select a starting filter vector $H_0$ and a stopping bound $\delta$.

[Step 2]   Compute the performance function $J_K$.

$$ J_K = \frac{H_K^T R_{SS} H_K}{H_K^T R_{NN} H_K} $$

[Step 3]   Compute the gradient $G_K$ of $J_K$ with respect to $H_K$.

$$ G_K = \frac{2}{H_K^T R_{NN} H_K} \, Q_K H_K $$

where $Q_K$ is given by (2.98).

[Step 4]   Compute the step direction vector $V_K$.

$$ V_K = \begin{cases} -G_K & \text{if } K \bmod n = 0 \\ -G_K + \dfrac{G_K^T (G_K - G_{K-1})}{G_{K-1}^T G_{K-1}} V_{K-1} & \text{else} \end{cases} $$

[Step 5]   Compute the adaptation gain.

$$ \alpha_K = - \frac{H_K^T Q_K V_K}{V_K^T Q_K V_K} $$

[Step 6]   Update the filter vector $H_K$.

$$ H_{K+1} = H_K + \alpha_K V_K $$

[Step 7]   Test for stopping condition:

If $\|H_{K+1} - H_K\| / \|H_K\| \le \delta$, then terminate the adaptation, otherwise go to step 2.
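
The CGF and CGP methods differ only in Step 4. The sketch below puts the two direction rules side by side, assuming the sign convention $V_K = -G_K$ at a restart. The function name `cg_direction`, the `variant` argument, and the sample vectors are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_direction(g, g_prev, v_prev, k, n, variant="CGF"):
    """Step direction V_K for the CGF (Fletcher-Reeves) or CGP (Polak-Ribiere) rule.

    A restart (V_K = -G_K) is applied whenever K mod n == 0.
    """
    if k % n == 0:
        return [-x for x in g]
    if variant == "CGF":
        beta = dot(g, g) / dot(g_prev, g_prev)
    elif variant == "CGP":
        diff = [a - b for a, b in zip(g, g_prev)]
        beta = dot(g, diff) / dot(g_prev, g_prev)
    else:
        raise ValueError("variant must be 'CGF' or 'CGP'")
    return [-a + beta * b for a, b in zip(g, v_prev)]


# Hypothetical gradients and previous direction.
g_prev, v_prev = [1.0, 0.0], [-1.0, 0.0]
g = [3.0, 4.0]

v_fr = cg_direction(g, g_prev, v_prev, k=1, n=2, variant="CGF")  # beta = 25
v_pr = cg_direction(g, g_prev, v_prev, k=1, n=2, variant="CGP")  # beta = 22
v_restart = cg_direction(g, g_prev, v_prev, k=2, n=2)            # restart step
```

When successive gradients are nearly equal, the CGP coefficient tends to zero and the search falls back toward the gradient direction, which is one commonly cited reason for its better behavior on nonquadratic functions.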

6.   Davidon-Fletcher-Powell Variable Metric Method (DFP)

The DFP method is applied to the MSNR adaptive filter in a similar way as for the mMSE adaptive filter. The major difference is the approximation of the adaptation gain and the need to evaluate the performance function at every iteration step K.

The adaptive filter based on the DFP method is updated by the following iterative scheme:

$$ H_{K+1} = H_K + \alpha_K V_K $$

The step direction vector $V_K$ is obtained by the variable metric:

$$ V_K = - A_K G_K \qquad (2.105) $$

The adaptation gain is obtained from Lemma (2.07):

$$ \alpha_K = - \frac{H_K^T Q_K V_K}{V_K^T Q_K V_K} \qquad (2.106) $$

where $Q_K$ is given by (2.98). The metric $A_K$ is updated by the DFP iterative procedure:

$$ A_{K+1} = A_K + \alpha_K \frac{V_K V_K^T}{V_K^T P_K} - \frac{A_K P_K P_K^T A_K}{P_K^T A_K P_K} \qquad (2.107) $$

The vector $P_K$ in (2.107) is defined as:

$$ P_K = G_{K+1} - G_K \qquad (2.108) $$

The adaptive filter designed by the DFP method is carried out in the following steps:

[Step 1]   Select a starting filter vector $H_0$ and the starting correction metric $A_0 = I$ (where I is the identity matrix), compute the gradient $G_0$ of the performance function as before, and set the stopping bound $\delta$.

[Step 2]   Compute the step direction vector $V_K$.

$$ V_K = - A_K G_K $$

[Step 3]   Compute the adaptation gain $\alpha_K$.

$$ \alpha_K = - \frac{H_K^T Q_K V_K}{V_K^T Q_K V_K} $$

[Step 4]   Update the filter vector $H_K$.

$$ H_{K+1} = H_K + \alpha_K V_K $$

[Step 5]   Compute the performance function $J_{K+1}$.

$$ J_{K+1} = \frac{H_{K+1}^T R_{SS} H_{K+1}}{H_{K+1}^T R_{NN} H_{K+1}} $$

[Step 6]   Compute the gradient $G_{K+1}$ of the performance function $J_{K+1}$ with respect to the filter vector $H_{K+1}$.

$$ G_{K+1} = \frac{2}{H_{K+1}^T R_{NN} H_{K+1}} \, Q_{K+1} H_{K+1} $$

where $Q_{K+1}$ is given by updating (2.98).

[Step 7]   Compute the vector $P_K$ by (2.108).

$$ P_K = G_{K+1} - G_K $$

[Step 8]   Update the correction matrix $A_K$ by (2.107).

$$ A_{K+1} = A_K + \alpha_K \frac{V_K V_K^T}{V_K^T P_K} - \frac{A_K P_K P_K^T A_K}{P_K^T A_K P_K} \qquad (2.107) $$

[Step 9]   Test for stopping condition.

This step can be done after step 4 to save some extra computations, but it was placed here to follow the same pattern as all other methods.

If $\|H_{K+1} - H_K\| / \|H_K\| \le \delta$, then terminate the adaptation procedure, otherwise go to step 2.
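
The metric update (2.107) of Step 8 can be sketched in a few lines. This is a minimal illustration, assuming a symmetric $A_K$ and hypothetical 2 x 2 values; the function name `dfp_update` and the sample numbers are not from the thesis. A convenient check is that the updated metric maps $P_K$ onto the filter step $\alpha_K V_K$, which follows directly from (2.107).

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def outer(u, v):
    """Outer product u v' as a list of rows."""
    return [[a * b for b in v] for a in u]


def dfp_update(a, v, p, alpha):
    """Return A_{K+1} = A + alpha V V'/(V'P) - A P P' A/(P' A P), as in (2.107)."""
    ap = mat_vec(a, p)        # A P; since A is symmetric, P'A is its transpose
    vv = outer(v, v)
    apap = outer(ap, ap)
    v_p = dot(v, p)
    p_ap = dot(p, ap)
    n = len(a)
    return [[a[i][j] + alpha * vv[i][j] / v_p - apap[i][j] / p_ap
             for j in range(n)] for i in range(n)]


# Hypothetical values for one update.
A0 = [[1.0, 0.0], [0.0, 1.0]]
V = [1.0, 2.0]
P = [2.0, 1.0]
ALPHA = 0.5

A1 = dfp_update(A0, V, P, ALPHA)
secant = mat_vec(A1, P)   # equals ALPHA * V up to rounding
```

The update keeps the metric symmetric, so the same helper can be applied repeatedly inside the Step 2 to Step 9 loop.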




7.   Amir's Transform Method (AT)

From test results, which will be shown in (2.17 - 2.28), it was observed that a faster converging method would be helpful for designing the MSNR adaptive filter. Neither the conjugate gradient method of Polak-Ribière nor the variable metric method following Davidon exhibits the same convergence speed as in the quadratic mMSE case. The reason for this slower convergence is that the MSNR performance function (2.25) is nonlinear and nonquadratic. It was then decided to derive a method tailored for this performance function. The derivation of this method is based on the generalized eigenvalue/eigenvector problem introduced by the stationary points of the performance function J in (2.25). The stationary point of the performance function J in (2.25) satisfies (2.40), which can be written in the form:

$$ (R_{NN}^{-1} R_{SS} - J_{max} I) H^* = 0 \qquad (2.109) $$

where $H^*$ is the optimal filter vector which maximizes the performance function J.

The optimal filter $H^*$ satisfies

$$ H^* = \frac{1}{J_{max}} R_{NN}^{-1} R_{SS} H^* \qquad (2.110) $$

From equation (2.110), it follows that an adaptive filter which uses the transform matrix $\frac{1}{J_K} R_{NN}^{-1} R_{SS}$ for updating the filter will satisfy (2.110) if it converges to the optimum.
In order to accelerate the convergence of such an adaptive filter, a gradient search is added to update the filter vector. The steepest descent search direction is adopted. The "best step" concept is used partially to compute the adaptation gain.

At iteration step K+1, the filter update equation is described by:

$$ H_{K+1} = \frac{1}{J_K} R_{NN}^{-1} R_{SS} H_K + \alpha_K G_K \qquad (2.111) $$

The transform matrix $M_K$ is defined as:

$$ M_K \triangleq \frac{1}{J_K} R_{NN}^{-1} R_{SS} \qquad (2.112) $$

Using (2.112) in (2.111), we obtain:

$$ H_{K+1} = M_K H_K + \alpha_K G_K \qquad (2.113) $$

The adaptation gain for the AT method can be obtained by Lemma 2.05:

$$ G_{K+1}^T G_K = 0 \qquad (2.57) $$

From (2.83) and (2.98), the gradient $G_{K+1}$ of $J_{K+1}$ is:

$$ G_{K+1} = \frac{2}{H_{K+1}^T R_{NN} H_{K+1}} \, Q_{K+1} H_{K+1} \qquad (2.114) $$

Using (2.113) and (2.114) in (2.57), we obtain:

$$ \frac{2}{H_{K+1}^T R_{NN} H_{K+1}} \left[ Q_{K+1} (M_K H_K + \alpha_K G_K) \right]^T G_K = 0 \qquad (2.115) $$

$$ \left( Q_{K+1} M_K H_K + \alpha_K Q_{K+1} G_K \right)^T G_K = 0 \qquad (2.116) $$

(2.116) can be viewed as a dot product between two orthogonal vectors, and the expression can be modified to be:

$$ G_K^T Q_{K+1} M_K H_K + \alpha_K G_K^T Q_{K+1} G_K = 0 \qquad (2.117) $$

Solving (2.117) for $\alpha_K$, we obtain:

$$ \alpha_K = - \frac{G_K^T Q_{K+1} M_K H_K}{G_K^T Q_{K+1} G_K} \qquad (2.118) $$

Since $Q_{K+1}$ is not known, we use Lemma (2.07) to approximate $\alpha_K$:

$$ \alpha_K = - \frac{G_K^T Q_K M_K H_K}{G_K^T Q_K G_K} \qquad (2.119) $$

In order for the adaptation gain $\alpha_K$ in (2.119) to be acceptable (i.e., the adaptive filter will converge), it must satisfy the condition (2.90).

The adaptive filter designed by the AT method is carried out in the following steps:

[Step 1]   Select a starting filter vector $H_0$ and a stopping bound $\delta$.

[Step 2]   Compute the performance function $J_K$ as in (2.25).

$$ J_K = \frac{H_K^T R_{SS} H_K}{H_K^T R_{NN} H_K} $$

[Step 3]   Compute the gradient $G_K = \nabla_{H_K} J_K$.

$$ G_K = \frac{2}{H_K^T R_{NN} H_K} \, Q_K H_K $$

[Step 4]   Compute the adaptation gain.

$$ \alpha_K = - \frac{G_K^T Q_K M_K H_K}{G_K^T Q_K G_K} $$

[Step 5]   Update the filter vector $H_K$ according to (2.113).

$$ H_{K+1} = M_K H_K + \alpha_K G_K $$

[Step 6]   Test for stopping condition.

If $\|H_{K+1} - H_K\| / \|H_K\| \le \delta$, then terminate the adaptation, otherwise go to step 2.
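
One iteration of Steps 2 to 5 can be sketched as follows. This is a minimal illustration, assuming the caller supplies $R_{NN}^{-1}$ explicitly (the argument `r_nn_inv` is hypothetical) and using the same hypothetical 2 x 2 example as a hand check: starting from $H = (2, 1)$ with $R_{NN} = I$ and $R_{SS} = \mathrm{diag}(2, 1)$, one step gives $H = (8/3, -1/3)$ and raises J from 9/5 to 129/65.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def at_step(h, r_ss, r_nn, r_nn_inv):
    """One AT update H_{K+1} = M_K H_K + alpha_K G_K; returns (H_{K+1}, J_K)."""
    ss_h, nn_h = mat_vec(r_ss, h), mat_vec(r_nn, h)
    beta = dot(h, nn_h)
    j = dot(h, ss_h) / beta                                  # Step 2

    def q_times(x):
        """Apply Q_K = R_SS - J_K R_NN to a vector."""
        return [s - j * n for s, n in zip(mat_vec(r_ss, x), mat_vec(r_nn, x))]

    g = [2.0 / beta * x for x in q_times(h)]                 # Step 3
    mh = [x / j for x in mat_vec(r_nn_inv, ss_h)]            # M_K H_K
    gqg = dot(g, q_times(g))
    # Step 4, with a zero-denominator safeguard added as an assumption.
    alpha = 0.0 if gqg == 0.0 else -dot(g, q_times(mh)) / gqg
    h_new = [a + alpha * b for a, b in zip(mh, g)]           # Step 5
    return h_new, j


def snr(h, r_ss, r_nn):
    return dot(h, mat_vec(r_ss, h)) / dot(h, mat_vec(r_nn, h))


R_SS = [[2.0, 0.0], [0.0, 1.0]]
R_NN = [[1.0, 0.0], [0.0, 1.0]]
R_NN_INV = [[1.0, 0.0], [0.0, 1.0]]   # inverse of the identity

H1, J0 = at_step([2.0, 1.0], R_SS, R_NN, R_NN_INV)
J1 = snr(H1, R_SS, R_NN)
```

In a full implementation the step would be repeated until the Step 6 test is met, and the gain would be checked against condition (2.90) before the update is accepted.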


E.   CONVERGENCE  AND  CONVERGENCE  RATE  OF  THE  GRADIENT  METHODS 
1.   SD  Adaptive  Filter 

Theorem 2.09

For any starting filter vector $H_0$, the sequence $\{H_K\}$ of the adaptive filter given by (2.56) converges to the unique optimal solution $H^*$ given by (2.12). Furthermore, the rate of convergence satisfies

$$ \frac{\|H_{K+1} - H^*\|}{\|H_K - H^*\|} \le \frac{1}{\beta} \cdot \frac{1 - C}{1 + C} \qquad (2.120) $$

where C is the condition number of the Hessian matrix $R_{XX}$ of the performance function J given in (2.10) and $\beta$ is a constant. The condition number is defined as

$$ C \triangleq \frac{\lambda_S}{\lambda_L} \qquad (2.121) $$

where $\lambda_L$, $\lambda_S$ are the largest and smallest eigenvalues of the Hessian matrix $R_{XX}$.

Proof

The Kantorovich inequality is used to prove the theorem.

The functional $\Psi_K$ is defined as

$$ \Psi_K \triangleq (H_K - H^*)^T R_{XX} (H_K - H^*) \qquad (2.122) $$

where $R_{XX}$ is the Hessian matrix of the performance function in (2.10). For the adaptive filter, $R_{XX}$ is the correlation matrix of the observed image signal.

$H^*$ is the filter which minimizes (2.10). The filter vector $H^*$ is called the optimal filter. $\Psi_K$ is updated at iteration step K + 1 as:

$$ \Psi_{K+1} = (H_{K+1} - H^*)^T R_{XX} (H_{K+1} - H^*) \qquad (2.123) $$

Using (2.56) to substitute for $H_{K+1}$ in (2.123), we have

$$ \Psi_{K+1} = (H_K + \alpha_K G_K - H^*)^T R_{XX} (H_K + \alpha_K G_K - H^*) \qquad (2.124) $$

Using the definition (2.122) and the fact that $R_{XX}$ is a symmetric matrix, we obtain:

$$ \Psi_{K+1} = \Psi_K + 2 \alpha_K G_K^T R_{XX} (H_K - H^*) + \alpha_K^2 G_K^T R_{XX} G_K \qquad (2.125) $$

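The adaptation gain $\alpha_K$ is given by (2.66), which can be used in (2.125) to obtain:

$$ \Psi_{K+1} = \Psi_K + 2 \alpha_K G_K^T R_{XX} (H_K - H^*) - \frac{1}{2} \alpha_K \frac{G_K^T G_K}{G_K^T R_{XX} G_K} \, G_K^T R_{XX} G_K $$

$$ \Psi_{K+1} = \Psi_K + 2 \alpha_K G_K^T R_{XX} (H_K - H^*) - \frac{1}{2} \alpha_K G_K^T G_K \qquad (2.126) $$

Using (2.63) and (2.12) in (2.126), we obtain:

$$ 2 R_{XX} (H_K - H^*) = 2 (R_{XX} H_K - R_{XS}) + 2 (R_{XS} - R_{XX} H^*) = G_K \qquad (2.127) $$

since $R_{XS} - R_{XX} H^* = 0$.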
Substitution of (2.127) in (2.126) gives

$$ \Psi_{K+1} = \Psi_K + \alpha_K G_K^T G_K - \frac{1}{2} \alpha_K G_K^T G_K = \Psi_K + \frac{1}{2} \alpha_K G_K^T G_K \qquad (2.128) $$

Let us define the vector $E_K$ as

$$ E_K \triangleq H_K - H^* \qquad (2.129) $$

Using this definition (2.129) in (2.122), we obtain:

$$ \Psi_K = E_K^T R_{XX} E_K \qquad (2.130) $$

From (2.127), we obtain:

$$ E_K = \frac{1}{2} R_{XX}^{-1} G_K \qquad (2.131) $$

Using (2.131) and the fact that $R_{XX}$ is symmetric, (2.130) becomes:

$$ \Psi_K = \frac{1}{4} G_K^T R_{XX}^{-1} G_K \qquad (2.132) $$

Using (2.132) in (2.128), we obtain

$$ \frac{\Psi_{K+1}}{\Psi_K} = 1 + \frac{\frac{1}{2} \alpha_K G_K^T G_K}{\frac{1}{4} G_K^T R_{XX}^{-1} G_K} \qquad (2.133) $$

Substitution of $\alpha_K$ given by (2.66) in (2.133) gives:

$$ \frac{\Psi_{K+1}}{\Psi_K} = 1 - \frac{(G_K^T G_K)(G_K^T G_K)}{(G_K^T R_{XX} G_K)(G_K^T R_{XX}^{-1} G_K)} \qquad (2.134) $$


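Now the Kantorovich inequality is used:

$$ \frac{(G_K^T R_{XX} G_K)(G_K^T R_{XX}^{-1} G_K)}{(G_K^T G_K)^2} \le \frac{(\lambda_S + \lambda_L)^2}{4 \lambda_S \lambda_L} \qquad (2.135) $$

where $\lambda_S$ and $\lambda_L$ are the smallest and largest eigenvalues of the matrix $R_{XX}$.

Using (2.135) in (2.134), we obtain:

$$ \frac{\Psi_{K+1}}{\Psi_K} \le 1 - \frac{4 \lambda_S \lambda_L}{(\lambda_S + \lambda_L)^2} \qquad (2.136) $$

$$ \frac{\Psi_{K+1}}{\Psi_K} \le \left( \frac{\lambda_L - \lambda_S}{\lambda_L + \lambda_S} \right)^2 \qquad (2.137) $$

Again, the condition number of the matrix $R_{XX}$ is defined as

$$ C \triangleq \frac{\lambda_S}{\lambda_L} \qquad (2.138) $$

(2.137) becomes

$$ \frac{\Psi_{K+1}}{\Psi_K} \le \left( \frac{1 - C}{1 + C} \right)^2 \qquad (2.139) $$

Since $R_{XX}$ is a positive definite matrix, the sequence $\{\Psi_K\}$ is a positive sequence.

Let us define

$$ q^2 \triangleq \left( \frac{1 - C}{1 + C} \right)^2 < 1 \qquad (2.140) $$

we obtain

$$ \Psi_{K+1} \le q^2 \, \Psi_K \qquad (2.141) $$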
From (2.141), we can see that when $K \to \infty$, the sequence $\{\Psi_K\}$ converges to zero. The reason is that $\{\Psi_K\}$ is a positive sequence which decreases by at least the factor $q^2 < 1$ at every step, thus $\Psi_\infty = 0$. It implies $E_\infty = 0$ (use (2.130) to justify this statement). Since $E_K$ is defined as $H_K - H^*$ in (2.129), we conclude that:

$$ H_\infty - H^* = 0 $$

or

$$ H_\infty = H^* \qquad (2.142) $$

This completes the proof of convergence.

From (2.139), we observe that the rate of convergence of the sequence $\{\Psi_K\}$ is given by (2.140). However, $\Psi_K$ as defined in (2.122) is a quadratic function of the vector $E_K = H_K - H^*$. It satisfies the relation

$$ \left( \frac{\Psi_{K+1}}{\Psi_K} \right)^{1/2} = \beta \, \frac{\|H_{K+1} - H^*\|}{\|H_K - H^*\|} \qquad (2.144) $$

where $\beta$ is a positive number.
Thus we obtain:

$$ \frac{\|H_{K+1} - H^*\|}{\|H_K - H^*\|} \le \frac{1}{\beta} \cdot \frac{1 - C}{1 + C} \qquad (2.145) $$

(Q.E.D.)

Theorem (2.09) proves that the SD method exhibits at least linear convergence.

Definition

An algorithm with the property that

$$ \frac{\|H_{K+1} - H^*\|}{\|H_K - H^*\|} = a = \text{constant} $$

is said to exhibit linear convergence.

The linear convergence is sometimes called geometric convergence since it follows from the definition that, for large K and j,

$$ \|H_K - H^*\| \approx a^{K-j} \, \|H_j - H^*\| $$

The speed of convergence of the SD method is a function of the condition number C. The more ill-conditioned $R_{XX}$, the slower will be the rate of convergence.
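
The bound (2.139) can be observed on a small quadratic example. The sketch below is a minimal illustration under stated assumptions: the gain is taken as $\alpha_K = -G_K^T G_K / (2 G_K^T R_{XX} G_K)$, which is the form implied by (2.126) since (2.66) itself lies outside this section, and the matrix, optimum and starting point are hypothetical. With eigenvalues 1 and 4 the condition number is C = 1/4, so the bound is $(0.75/1.25)^2 = 0.36$; the chosen starting point happens to attain it at every step.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def psi(h, h_star, r_xx):
    """Error functional (2.122)."""
    e = [a - b for a, b in zip(h, h_star)]
    return dot(e, mat_vec(r_xx, e))


def sd_ratios(h, r_xx, r_xs, h_star, steps=10):
    """Run SD on the quadratic mMSE function; return Psi_{K+1}/Psi_K per step."""
    ratios = []
    for _ in range(steps):
        psi_k = psi(h, h_star, r_xx)
        if psi_k < 1e-20:  # safeguard (assumption): already at the optimum
            break
        g = [2.0 * (a - b) for a, b in zip(mat_vec(r_xx, h), r_xs)]
        alpha = -dot(g, g) / (2.0 * dot(g, mat_vec(r_xx, g)))
        h = [a + alpha * b for a, b in zip(h, g)]
        ratios.append(psi(h, h_star, r_xx) / psi_k)
    return ratios


R_XX = [[1.0, 0.0], [0.0, 4.0]]   # eigenvalues 1 and 4, so C = 1/4
H_STAR = [1.0, 1.0]
R_XS = mat_vec(R_XX, H_STAR)      # makes H_STAR the minimizer
C = 1.0 / 4.0
BOUND = ((1.0 - C) / (1.0 + C)) ** 2
RATIOS = sd_ratios([5.0, 2.0], R_XX, R_XS, H_STAR)
```

Increasing the spread between the two eigenvalues pushes C toward zero and the bound toward one, which is the slow-convergence behavior described above for an ill-conditioned $R_{XX}$.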

Theorem (2.09) used the mMSE quadratic function. For the MSNR performance function, it was shown (II.E.1) that the sequence $\{H_K\}$ generated by the SD method converges. Test results showed that the convergence of SD is slower for MSNR than for mMSE.

2.   ASD Adaptive Filter

The algorithm is illustrated in Fig. 2.01.

Fig. 2.01  The ASD algorithm.

The steepest descent steps are labeled SD, the accelerated steps are labeled ASD.

Shah, Buehler and Kempthorne [53] showed that for an n dimensional quadratic function, the sequence of iterates $H_0, H_2, H_4, \ldots$ is identical to the full sequence of iterates generated by the conjugate-gradient descent. Since the conjugate gradient descent takes no more than n steps to reach the minimum of the n dimensional quadratic function, the accelerated steepest descent takes no more than (2n-1) steps.

Applying the ASD method to design a multidimensional adaptive filter using real test screen images has shown poor convergence speed for both the mMSE and MSNR performance functions. The cause is error propagation: the method is sensitive to accumulated errors, and in their presence the conditions for accelerated convergence are no longer satisfied.
3.   CGF Adaptive Filter

The conjugate gradient methods CGP and CGF exhibit quadratic termination (apart from rounding errors) for the mMSE performance function. Quadratic termination means that for a quadratic performance function it is guaranteed that the minimum will be located exactly (apart from rounding errors) in no more than n steps. However, for nonquadratic functions like (2.25) the conjugate gradient method does not exhibit quadratic termination. For the infinite dimensional case, Daniel [60] showed that the rate of convergence is:

$$ \frac{\|H_{K+1} - H^*\|}{\|H_K - H^*\|} \le \frac{\sqrt{\lambda_L} - \sqrt{\lambda_S}}{\sqrt{\lambda_L} + \sqrt{\lambda_S}} $$

where $\lambda_L$, $\lambda_S$ are the largest and smallest eigenvalues of the Hessian matrix of the performance function J.

Depending on the approach used to design the adaptive filter for a nonlinear, nonquadratic performance function, different rates of convergence can be obtained. Some approaches exhibit quadratic convergence (those which approximate the performance function by a Taylor series expansion). Others exhibit superlinear convergence.

Theorem 2.10

Let the performance function J be defined by (2.10) and the adaptive filter be designed using the conjugate gradient method. Then the sequence of adaptive filters $\{H_K\}$ converges in no more than n steps to the unique minimum $H^*$ of the performance function J.

Proof

The proof is based on the fact that both methods, CGF and CGP, are based on the conjugate direction search method, which implies that the step direction vector $V_K$ is orthogonal to the gradient of the performance function J at iteration step K + 1. This fact is stated as [55, 61]:

$$ G_{K+1}^T V_K = 0 \qquad (2.146) $$
The adaptation equation is:

$$ H_{K+1} = H_K + \alpha_K V_K \qquad (2.147) $$

Its expression at the iteration step n can be related to all steps from iteration step K by:

$$ H_n = H_{K+1} + \sum_{j=K+1}^{n-1} \alpha_j V_j \qquad (2.148) $$

for any $0 \le K \le n - 1$.

The gradient of the performance function J at iteration step n is given by:

$$ G_n = 2 (R_{XX} H_n - R_{XS}) \qquad (2.149) $$

By substituting (2.148) in (2.149), we get:

$$ G_n = G_{K+1} + 2 \sum_{j=K+1}^{n-1} \alpha_j R_{XX} V_j \qquad (2.150) $$

Using equation (2.146) in (2.150), we obtain:

$$ V_K^T G_n = 2 \sum_{j=K+1}^{n-1} \alpha_j V_K^T R_{XX} V_j \qquad (2.151) $$

The method of conjugate gradient is based on generating a conjugate sequence of step direction vectors $\{V_K\}$. The conjugacy condition is:

$$ V_K^T R_{XX} V_j = 0 \quad \text{for } K \ne j \qquad (2.152) $$

We use (2.152) in (2.151) to show that:

$$ V_K^T G_n = 0 \qquad (2.153) $$

The step direction vectors $V_0, V_1, \ldots, V_{n-1}$ form a complete conjugate basis. Therefore, at iteration step n, the only $G_n$ which satisfies (2.153) is

$$ G_n = 0 \qquad (2.154) $$

But for the quadratic performance function, the gradient vanishes only at the minimum. So we proved that the method converges to the minimum of J in (2.10) in no more than n steps.

Substituting (2.154) in (2.149), we obtain

$$ R_{XX} H_n - R_{XS} = 0 \qquad (2.155) $$

$$ H_n = R_{XX}^{-1} R_{XS} \qquad (2.156) $$

So

$$ H_n = H^* \qquad (2.157) $$

Thus the filter converges to the unique minimum of J.

Q.E.D.
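
The n-step termination of Theorem 2.10 can be illustrated on a two-dimensional quadratic. The sketch below is a minimal example, assuming the Fletcher-Reeves direction rule, the gradient (2.149), and an exact line search along $V_K$ (the gain $\alpha_K = -G_K^T V_K / (2 V_K^T R_{XX} V_K)$ follows from (2.146)). The matrix, optimum and starting point are hypothetical; with them the two steps give $\alpha_0 = 0.2$, $\alpha_1 = 0.3125$ and land on $H^* = (1, 1)$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def gradient(h, r_xx, r_xs):
    """Gradient of the quadratic performance function, as in (2.149)."""
    return [2.0 * (a - b) for a, b in zip(mat_vec(r_xx, h), r_xs)]


def cg_quadratic(h, r_xx, r_xs):
    """Fletcher-Reeves conjugate gradient run for exactly n = len(h) steps."""
    g = gradient(h, r_xx, r_xs)
    v = [-x for x in g]
    for _ in range(len(h)):
        gg = dot(g, g)
        if gg == 0.0:  # safeguard (assumption): gradient already zero
            break
        # Exact line search: enforces G_{K+1}' V_K = 0, condition (2.146).
        alpha = -dot(g, v) / (2.0 * dot(v, mat_vec(r_xx, v)))
        h = [a + alpha * b for a, b in zip(h, v)]
        g_new = gradient(h, r_xx, r_xs)
        beta = dot(g_new, g_new) / gg
        v = [-a + beta * b for a, b in zip(g_new, v)]
        g = g_new
    return h


R_XX = [[1.0, 0.0], [0.0, 4.0]]
H_STAR = [1.0, 1.0]
R_XS = mat_vec(R_XX, H_STAR)   # makes H_STAR the minimizer

H_n = cg_quadratic([5.0, 2.0], R_XX, R_XS)
```

On the same problem the steepest descent method only contracts the error by a fixed factor per step, so the comparison shows why quadratic termination matters in the mMSE case.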

In practical applications, it was found that the conjugate gradient methods sometimes need more than n steps to converge. The reason is round-off error: the two conditions (2.146) and (2.152) are then not satisfied exactly, so the sequence $\{V_K\}$ of step directions does not form a complete basis in n iteration steps.

For the MSNR cases, the adaptive filter could not converge as fast as in the mMSE cases because the performance function J in the MSNR case is nonquadratic.
4.  The DFP Method

The variable metric DFP method exhibits quadratic termination, apart from rounding errors, for the mMSE performance function.

Fletcher and Powell [58] proved for a general performance function J that a positive definite variable metric $A_K$ implies a positive definite $A_{K+1}$, updated by (2.77). They showed that for a quadratic function like the mMSE type, successive filter updates $\Delta H_0, \Delta H_1, \ldots, \Delta H_{n-1}$ form a set of conjugate directions, and $A_{n-1} = R_{XX}^{-1}$, so the DFP algorithm exhibits quadratic termination in n steps.

The MSNR performance function is nonquadratic and nonlinear, so the DFP method cannot exhibit quadratic termination. But according to our test results, it is still a fast convergence technique. If the method converges slowly, it is recommended to restart the variable metric every n+1 steps by setting $A_{n+1} = I$, to overcome round-off errors.

5.  The  AT  Adaptive  Filter 

The  Amir  transform  adaptive  filter  exhibits  very 
fast  convergence  speed.  The  reason  lies  in  the  way  it  was 
designed.  Each  iterative  step  uses  a  transform  to  satisfy 
the  generalized  eigenvalue  and  eigenvector  steady  state  equation. 

Theorem 2.11

Let the adaptive filter be updated by (2.111) and the performance function be defined by (2.25). Then the filter $H_K$ converges to the unique optimal filter $H^*$ if the adaptation gain $\alpha_K$ satisfies condition (2.90).

Proof

The adaptive filter $H_K$ is updated by (2.111).

Substituting (2.83) for $G_K$ in (2.111), and defining

$$ \beta_K \triangleq H_K^T R_{NN} H_K \qquad (2.158) $$

we obtain

$$ H_{K+1} = \frac{1}{J_K} R_{NN}^{-1} R_{SS} H_K + \frac{2 \alpha_K}{\beta_K} (R_{SS} - J_K R_{NN}) H_K \qquad (2.159) $$

Rearranging (2.159) gives

$$ H_{K+1} = \left[ \frac{1}{J_K} R_{NN}^{-1} R_{SS} + \frac{2 \alpha_K J_K}{\beta_K} R_{NN} \left( \frac{1}{J_K} R_{NN}^{-1} R_{SS} - I \right) \right] H_K \qquad (2.160) $$

Subtracting $H_K$ from both sides, we obtain:

$$ H_{K+1} - H_K = \left( I + \frac{2 \alpha_K J_K}{\beta_K} R_{NN} \right) \left( \frac{1}{J_K} R_{NN}^{-1} R_{SS} - I \right) H_K \qquad (2.161) $$

I is the identity matrix.

Let us define the matrix $Z_K$ as:

$$ Z_K \triangleq I + \frac{2 \alpha_K J_K}{\beta_K} R_{NN} \qquad (2.162) $$

Since $R_{NN}$, $R_{SS}$ are positive definite and $\alpha_K$, $J_K$, $\beta_K$ are positive numbers, $Z_K$ is positive definite. Since $\alpha_K$, $J_K$, $\beta_K$ are bounded, the norm of the matrix $Z_K$ is bounded.
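In other words, there exists a positive number $\lambda$ such that:

$$ \|Z_K\| \le \lambda \qquad (2.163) $$

Taking the norm of (2.161) and using the inequality

$$ \|A \cdot B\| \le \|A\| \cdot \|B\| \qquad (2.164) $$

where A, B are matrices, and combining with (2.163), we obtain:

$$ \|H_{K+1} - H_K\| \le \lambda \cdot \left\| \frac{1}{J_K} R_{NN}^{-1} R_{SS} - I \right\| \cdot \|H_K\| \qquad (2.165) $$

which turns out to be

$$ \frac{\|H_{K+1} - H_K\|}{\|H_K\|} \le \lambda \cdot \left\| \frac{1}{J_K} R_{NN}^{-1} R_{SS} - I \right\| \qquad (2.166) $$

If the convergence error $\varepsilon_K$ is defined as

$$ \varepsilon_K \triangleq \frac{\|H_{K+1} - H_K\|}{\|H_K\|} \qquad (2.167) $$

then

$$ \varepsilon_K \le \lambda \cdot \left\| \frac{1}{J_K} R_{NN}^{-1} R_{SS} - I \right\| \qquad (2.168) $$

The largest eigenvalue of $\frac{1}{J_K} R_{NN}^{-1} R_{SS} - I$ is

$$ \frac{J_{max}}{J_K} - 1 \qquad (2.169) $$

where $J_{max}$ is the largest eigenvalue of $R_{NN}^{-1} R_{SS}$.

But

$$ \left\| \frac{1}{J_K} R_{NN}^{-1} R_{SS} - I \right\| = \frac{J_{max}}{J_K} - 1 \qquad (2.170) $$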
So

$$ \varepsilon_K \le \lambda \cdot \left( \frac{J_{max}}{J_K} - 1 \right) \qquad (2.171) $$

The adaptation gain $\alpha_K$ is designed to satisfy condition (2.90), which states that:

$$ J_{K+1} \ge J_K \ge 0 $$

Updating (2.171) and using condition (2.90), we have:

$$ \begin{aligned} \text{I.} \quad & \varepsilon_{K+1} \le \lambda \cdot \left( \frac{J_{max}}{J_{K+1}} - 1 \right) \\ \text{II.} \quad & \varepsilon_K \le \lambda \cdot \left( \frac{J_{max}}{J_K} - 1 \right) \\ \text{III.} \quad & J_K \le J_{K+1} \end{aligned} \qquad (2.172) $$

Thus, the sequence $\{\varepsilon_K\}$ converges to zero, because $J_{max}$ is the maximum value of the unimodal performance function, and the sequence $\{J_K\}$ is an ascending sequence bounded by the upper bound $J_{max}$, so

$$ \lim_{K \to \infty} \varepsilon_K \le \lambda \cdot \lim_{K \to \infty} \left( \frac{J_{max}}{J_K} - 1 \right) = \lambda \cdot \left( \frac{J_{max}}{J_{max}} - 1 \right) = 0 \qquad (2.173) $$

This proves that the filter converges. At the convergence point, (2.170) satisfies

$$ \left\| \frac{1}{J_\infty} R_{NN}^{-1} R_{SS} - I \right\| = 0 \qquad (2.174) $$

or, in the vector form

$$ \left( \frac{1}{J_\infty} R_{NN}^{-1} R_{SS} - I \right) H_\infty = 0 \qquad (2.175) $$

(2.175) is the equation for the stationary points of the performance function J. Thus, $J_\infty = J_{max}$ and correspondingly $H_\infty = H^*$.

So the adaptive filter converges to the unique optimum.

(Q.E.D.)

F.   PRESENTATION  OF  RESULTS 

1.   Organization  of  Results 

The  performances  of  both  mMSE  and  MSNR  nonrecursive 
adaptive  spatial  filters  have  been  extensively  evaluated  on 
two  real  world  infrared  images,  shown  in  Fig.  2.1  and  2.2. 
Before  the  detailed  presentation  of  these  results,  a  detailed 
description  is  given  of  how  the  evaluations  are  organized. 

(a)  Filter  type: 

-  Nonrecursive  adaptive  spatial  filter 

-  Search  box  (filter  size)  3  by  3  pixels  with 

the  estimation  pixel  in  the  middle  of  the  filter 

(b)  Optimization  criterion  and  performance  function: 

-  mMSE:   Minimization  of  mean  square  error 

-  MSNR:   Maximization  of  signal  to  noise  ratio 

(c)  Adaptation equation:
1.   LMS approach:

$$ H_{K+1} = H_K + 2 \mu e_K X_K $$

2.  Gradient approaches:

$$ H_{K+1} = H_K + \alpha_K G_K $$

3.  Conjugate gradient approaches:

$$ H_{K+1} = H_K + \alpha_K V_K $$

4.  Variable metric approach:

$$ H_{K+1} = H_K + \alpha_K V_K $$

5.  Amir's transform approach:

$$ H_{K+1} = M_K H_K + \alpha_K G_K $$

Fig. 2.1  A 9 level computer print of Indiana infrared test image.

Fig. 2.2  A 9 level computer print of China Lake infrared test image.

(d)  Search  methods: 

1.  LMS  approach: 

Steepest  descent  method 

2.  Gradient approaches:

Steepest descent method
Accelerated steepest descent method
Amir's method (applies only to the mMSE case)

3.  Conjugate gradient approaches:

Fletcher-Reeves method
Polak-Ribière method

4.  Variable metric approaches:

Davidon-Fletcher-Powell method

5.  Amir's transform approach:

Applies only to the MSNR case

(e)  Test images used:

1.   Indiana image (Fig. 2.1):

32 x 32 pixels

Blue spike infrared spectral band

An image taken from a city in Indiana and used quite extensively as a standard test image for high altitude downward looking infrared surveillance systems.

2.   China Lake image (Fig. 2.2):

32 x 32 pixels

Thermal infrared band in the 10-13 μ range

An image taken from a desert area in China Lake, California with a highway in the picture. It has been used as one of the standard test images for short distance side looking infrared target acquisition systems.

(f)   Performance  evaluation: 

The  performance  of  the  adaptive  filters  is  presented 
in  four  different  ways,  all  as  a  function  of  the 
number  of  iterative  steps  N. 

1.  Filter  coefficients  as  a  function  of  N. 

(9  coefficients  for  a  3  x  3  spatial  filter) 

2.  Output  variance  as  a  function  of  N. 

3.  Processing gain as a function of N.   The
processing gain is defined as follows:

$$ PG = 10 \log \frac{m_i^2 + \sigma_i^2}{m_o^2 + \sigma_o^2} $$

where $m_i$, $m_o$ = means of the input and
filtered images respectively.

$\sigma_i^2$, $\sigma_o^2$ = variances of the input and
filtered images respectively.

4.   Output signal to noise ratio (used only in
MSNR cases):

Output SNR of the filtered image is defined
as follows:

$$ SNR_o = \frac{H^T R_{SS} H}{H^T R_{NN} H} $$

where $H$ = the filter vector
$R_{SS}$ = target signal correlation matrix
$R_{NN}$ = clutter noise correlation matrix
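
As an illustration only, the two measures defined above can be evaluated with a few lines of Python. This is a minimal sketch, not the program used in this study. The function names are hypothetical, the logarithm is assumed to be base 10 because the gain is quoted in dB, and the 2-tap filter vector and matrices passed to `output_snr` are made-up toy values. The means and variances are the ones quoted later in this chapter for the Indiana image.

```python
import math

def processing_gain_db(mean_in, var_in, mean_out, var_out):
    """PG = 10 log10((m_i^2 + s_i^2) / (m_o^2 + s_o^2)), in dB."""
    power_in = mean_in ** 2 + var_in
    power_out = mean_out ** 2 + var_out
    return 10.0 * math.log10(power_in / power_out)

def quadratic_form(h, r):
    """Return h^T R h for a vector h and a square matrix R (nested lists)."""
    n = len(h)
    return sum(h[i] * r[i][j] * h[j] for i in range(n) for j in range(n))

def output_snr(h, r_ss, r_nn):
    """SNR_o = (H^T R_SS H) / (H^T R_NN H)."""
    return quadratic_form(h, r_ss) / quadratic_form(h, r_nn)

# Indiana image statistics quoted in this chapter (before / after filtering).
pg_indiana = processing_gain_db(3.30397, 0.74111, 0.00006495, 0.012)

# Toy 2-tap case (assumed values, for illustration only).
snr_toy = output_snr([1.0, 0.0],
                     [[2.0, 0.0], [0.0, 2.0]],
                     [[1.0, 0.0], [0.0, 1.0]])

print(round(pg_indiana, 3), snr_toy)
```

With the Indiana statistics the sketch gives a processing gain of about 29.87 dB, in line with the value reported for that image in the discussion of the China Lake results.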

2.   Results of mMSE Adaptive Spatial Filters
I  -  Indiana  Image 

The test results of adaptive filters based on the
mMSE criterion and using the Indiana test image are
presented in the following figures:

Fig.  2.3  -  LMS  approach,  steepest  descent  method 

Fig.  2.4  -  Gradient  approach,  steepest  descent  method 

Fig.  2.5  -  Gradient  approach,  accelerated  steepest 
descent  method 

Fig.  2.6  -  Gradient  approach,  Amir's  method 

Fig.  2.7  -  Conjugate  gradient  approach,  Fletcher-Reeves 
method 

Fig.  2.8  -  Conjugate  gradient  approach,  Pollack 
method 

Fig.  2.9  -  Variable  metric  approach,  Davidon-Fletcher- 
Powell  method. 

In each figure three results - the nine filter
coefficients, output variance and processing gain - are
presented as a function of iteration steps.

[Figs. 2.3-2.9. Panel titles recovered: LMS Algorithm;
Steepest Descent Method - mMSE; Accelerated Steepest
Descent - mMSE (Fig. 2.5a); Amir's Method - mMSE;
Fletcher-Reeves Method - mMSE; Pollack Method - mMSE;
Davidon-Fletcher-Powell Method - mMSE.]

The following additional numerical results are
presented in Table II-1:

-  Processing  gain 

-  Mean  of  the  filtered  image 

-  Variance  of  the  filtered  image 

-  Number  of  iteration  steps  to  go  below  the  prescribed 
error 

-  Actual  adaptation  error  when  the  adaptation  is 
stopped. 

a.   Discussion 

These  results  will  be  discussed  in  several  groups. 

(1)   LMS Approach and Steepest Descent Method.
This approach is the two dimensional extension of the most
widely used adaptive filter technique.   In Fig. 2.3, we can
see that the adaptation took close to one thousand steps
to reach the minimum of the output variance and the maximum
of the processing gain.   However, the adaptation never
achieved a steady state, even after 10,000 iteration steps.

Further, there is a steady state deviation
from the optimum output variance.   It is known as the
"misadjustment," which commonly occurs in the traditional
adaptive filter approach [23].

We believe these problems are the consequences
of the basic assumptions of the LMS algorithm.   The reasons
are probably not obvious if we follow the traditional adaptive
concept, which was initiated by Prof. Widrow using the error
signal concept in control, $e = H^T X - d$.   The filtered output
$H^T X$ is compared with a desired result, $d$.   Their difference,
$e$, is used together with a constant, but adjustable, adaptive
gain, $2\mu$, to form a correction term, $\Delta H$, for the filter
coefficients to approach the optimization goal, which is the
minimization of mean square error.

On the other hand, if we consider the
adaptation procedure as an optimization process, then the
adaptation equation takes the form of

$$ H_{K+1} = H_K + \Delta H_K = H_K + \alpha_K G_K $$

where $G_K$ is called the "gradient" and $\alpha_K$ is called the
"step size."   The concept of gradient means the gradient of the
performance function surface, $J$.   The product of the adaptation
step size $\alpha_K$ and the gradient $G_K$ is the correction
term $\Delta H_K$.

It is postulated that the assumptions made
by the LMS approach are not directly responsive to the goal
of adaptation because the error term $H^T X - d$ is not directly
related to the minimization of the performance function.
Further, the assumption that the adaptive gain $2\mu$, which
corresponds to the concept of step size in optimization, is
constant does not coincide with the fact that the iterative
steps toward optimization usually take place in varying step
sizes.   These problems contributed to the slow convergence
and the steady state misadjustment in the LMS adaptive
spatial filters.

We developed several adaptive filters using
gradient methods developed in the optimization field.  Their
results are discussed in the following.
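
The contrast between a constant adaptive gain and a varying step size can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the implementation used in this thesis: the 2-tap correlation matrix `R`, the optimum filter `W_TRUE`, the noise level, the seed and the tolerance are all made-up values, and the exact line search used to pick the step size is only one common way of letting the step vary. The error is taken as e = H^T X - d, so the correction carries an explicit minus sign.

```python
import random

R = [[1.0, 0.8], [0.8, 1.0]]      # assumed input correlation matrix
W_TRUE = [0.7, -0.3]              # assumed optimum filter (toy value)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(m, v):
    return [dot(row, v) for row in m]

P = matvec(R, W_TRUE)             # cross-correlation vector, P = R W

def steepest_descent(tol=1e-12, max_steps=1000):
    """Gradient search on J(H) = H^T R H - 2 P^T H with exact line search."""
    h = [0.0, 0.0]
    for step in range(max_steps):
        g = [2.0 * (rh - p) for rh, p in zip(matvec(R, h), P)]
        if dot(g, g) < tol:
            return h, step
        alpha = dot(g, g) / (2.0 * dot(g, matvec(R, g)))   # changes each step
        h = [hi - alpha * gi for hi, gi in zip(h, g)]
    return h, max_steps

def lms(mu=0.01, steps=5000, noise=0.1, seed=1):
    """LMS update: constant gain 2*mu, instantaneous error e = H^T X - d."""
    rng = random.Random(seed)
    h = [0.0, 0.0]
    for _ in range(steps):
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = [u1, 0.8 * u1 + 0.6 * u2]                      # E[x x^T] equals R
        d = dot(W_TRUE, x) + noise * rng.gauss(0.0, 1.0)
        e = dot(h, x) - d
        h = [hi - 2.0 * mu * e * xi for hi, xi in zip(h, x)]
    return h

def weight_error(h):
    return max(abs(hi - wi) for hi, wi in zip(h, W_TRUE))

h_sd, n_sd = steepest_descent()
h_lms = lms()
print("steepest descent:", n_sd, "steps, error", weight_error(h_sd))
print("LMS after 5000 samples: error", weight_error(h_lms))
```

In this toy case the gradient search stops once its gradient falls below the tolerance, while the LMS weights keep fluctuating around the optimum because each correction is driven by a noisy instantaneous error scaled by a fixed gain.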

(2)  Gradient Approaches.   Three different methods
were developed.   Their results are shown in Figures 2.4, 2.5,
and 2.6 for the steepest descent (SD), accelerated steepest
descent (ASD) and Amir's (AMM) methods respectively.

The reasoning described above is convincingly
supported by the following observations:

a.  The convergence of adaptation is faster.   It took 541,
445, and 220 steps for the SD, ASD and AMM methods,
respectively, to reach the stopping condition, an
adaptation error less than $1.5 \times 10^{-11}$, as shown
in Table II-1.

b.  The adaptation procedure indeed reached steady state
once the adaptation error was less than the stopping
condition.

c.  The steady state error is smaller than that of the LMS
algorithm, as shown in Table II-1.  In fact, the output
variance is equal to that of the optimum filter.

(3)  Conjugate Gradient Approaches.   Two different
methods were developed.   Their results are shown in
Figures 2.7 and 2.8 for the Fletcher-Reeves (CGF) and the
Pollack (CGP) methods respectively.

Again, the improvements are clearly seen.
In fact, they are even better than the gradient methods.   The
convergence took only 66 and 10 steps for the CGF and CGP
methods, respectively, to reach below the stopping condition
of $1.5 \times 10^{-11}$.   At the same time, the output
variance is the same.
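
For readers unfamiliar with conjugate gradient searches, the following minimal sketch shows the two direction-update rules on a small quadratic surface. It rests on several assumptions: the "Pollack" method is taken here to be the Polak-Ribiere variant, the 3 x 3 matrix `A` and vector `B` are made-up toy values rather than image statistics, and an exact line search is used. It is not the adaptive filter program of this thesis.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(m, v):
    return [dot(row, v) for row in m]

def conjugate_gradient(a, b, rule, tol=1e-20, max_steps=50):
    """Minimize J(H) = 0.5 H^T A H - b^T H; rule is "FR" or "PR"."""
    h = [0.0] * len(b)
    g = [ah - bi for ah, bi in zip(matvec(a, h), b)]      # gradient at H
    d = [-gi for gi in g]                                  # first direction
    for step in range(1, max_steps + 1):
        ad = matvec(a, d)
        alpha = -dot(g, d) / dot(d, ad)                    # exact line search
        h = [hi + alpha * di for hi, di in zip(h, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, ad)]
        if dot(g_new, g_new) < tol:
            return h, step
        if rule == "FR":                                   # Fletcher-Reeves
            beta = dot(g_new, g_new) / dot(g, g)
        else:                                              # Polak-Ribiere
            diff = [gn - go for gn, go in zip(g_new, g)]
            beta = dot(g_new, diff) / dot(g, g)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return h, max_steps

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]   # assumed SPD matrix
B = [1.0, 2.0, 3.0]                                        # assumed vector

h_fr, n_fr = conjugate_gradient(A, B, "FR")
h_pr, n_pr = conjugate_gradient(A, B, "PR")
print(n_fr, h_fr)
print(n_pr, h_pr)
```

On an exactly quadratic surface with an exact line search the two rules produce the same directions and need, in exact arithmetic, no more steps than there are coefficients. Differences between them appear only when the surface is not quadratic or the line search is inexact.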

(4)  Variable Metric Approach.   Results of this
approach, which is extended from the one dimensional work of
Davidon-Fletcher-Powell, are shown in Fig. 2.9.   Again, the
improvements are clearly seen.   The background suppression
result, as measured by the output variance and processing
gain, is the same.   But the convergence speed is even better:
it took only 9 iteration steps to reach below the stopping
condition.
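
The variable metric idea can be sketched in the same toy setting. The code below is a minimal illustration of the standard Davidon-Fletcher-Powell update of an approximate inverse Hessian, assuming a made-up 3 x 3 quadratic surface (`A`, `B`) and an exact line search. The names and values are hypothetical and the sketch is not the two dimensional extension developed in this thesis.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(m, v):
    return [dot(row, v) for row in m]

def dfp(a, b, tol=1e-20, max_steps=50):
    """Minimize J(H) = 0.5 H^T A H - b^T H with the DFP variable metric."""
    n = len(b)
    h = [0.0] * n
    # s_mat approximates the inverse Hessian; start from the identity.
    s_mat = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = [ah - bi for ah, bi in zip(matvec(a, h), b)]
    for step in range(1, max_steps + 1):
        d = [-v for v in matvec(s_mat, g)]                 # search direction
        alpha = -dot(g, d) / dot(d, matvec(a, d))          # exact line search
        delta = [alpha * di for di in d]                   # change in H
        h = [hi + di for hi, di in zip(h, delta)]
        g_new = [ah - bi for ah, bi in zip(matvec(a, h), b)]
        if dot(g_new, g_new) < tol:
            return h, step
        y = [gn - go for gn, go in zip(g_new, g)]          # change in gradient
        sy = matvec(s_mat, y)
        dy = dot(delta, y)
        ysy = dot(y, sy)
        # DFP update: S + delta delta^T / (delta^T y) - (S y)(S y)^T / (y^T S y)
        s_mat = [[s_mat[i][j] + delta[i] * delta[j] / dy - sy[i] * sy[j] / ysy
                  for j in range(n)] for i in range(n)]
        g = g_new
    return h, max_steps

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]   # assumed SPD matrix
B = [1.0, 2.0, 3.0]                                        # assumed vector

h_dfp, n_dfp = dfp(A, B)
print(n_dfp, h_dfp)
```

The design point is that the metric `s_mat` accumulates curvature information from successive gradient differences, so later directions are scaled much like Newton steps without the Hessian ever being inverted explicitly.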

3.   Results of mMSE Adaptive Spatial Filters II -
China Lake Image

The test results of adaptive filters based on the mMSE
criterion and using the China Lake test image are presented
in the following figures:

Fig.  2.10  -  LMS  approach,  steepest  descent  method 

Fig.  2.11  -  Gradient  approach,  steepest  descent  method 

Fig.  2.12  -  Gradient  approach,  accelerated  steepest 
descent  method 

Fig.  2.13  -  Gradient  approach,  Amir's  method 

Fig.  2.14  -  Conjugate  gradient  approach,  Fletcher- 
Reeves  method 

Fig.  2.15  -  Conjugate  gradient  approach,  Pollack  method 

Fig.  2.16  -  Variable  metric  approach,  Davidon-Fletcher- 
Powell  method 

In  each  figure,  three  results  are  presented  as 
functions  of  iteration  steps:   filter  coefficients,  output 
variance  and  processing  gain. 

Further, additional results are summarized and
presented in Table II-2:

-  Processing gain

-  Mean of the filtered image

-  Variance of the filtered image

-  Number of iteration steps to go below the
prescribed stopping error

-  Actual adaptation error when the adaptation
is stopped.

a.   Discussion 

The results using the China Lake image are
generally similar to those using the Indiana image.  Only
the important features will be summarized below.

(1)  LMS Approaches.  The adaptation based on
the LMS approach again shows three problems:  slow
convergence, failure to reach steady state, and misadjustment.

(2)  New Approaches Developed in this Thesis.
All new approaches achieve the same steady state performance,
equal to that of the optimum filter, as shown in Table II-2:

Mean  of  the  filtered  image =  6.495  «10~ 

Variance  of  the  filtered  image =  1.2  •  10 

However, they converge to the steady state value in far
fewer steps, as also shown in Table II-2.

Therefore, test results on the China Lake
image again demonstrated the improvements in adaptive
filters using the approaches suggested in this thesis.

It is interesting to note that the effectiveness
of background clutter suppression in the case of
the China Lake image is not as good as that in the case of
the Indiana image.   For example, the processing gain for
the China Lake image is 19.32 dB compared with 29.874 dB
for the Indiana image.  We believe this difference is related
to the spatial correlation of the image.  The higher the
correlation, the better is the background clutter suppression.
The Indiana image is more spatially correlated than the China
Lake image.

[Figs. 2.10-2.16. Panel titles recovered: LMS Algorithm;
Steepest Descent - mMSE; Fletcher-Reeves Method - mMSE;
Davidon-Fletcher-Powell Method - mMSE.]

4.   Results of MSNR Adaptive Spatial Filters I -
Indiana Image

The test results of MSNR adaptive spatial filters
using the Indiana test image are presented in the following
figures:

Fig.  2.17  -  Gradient  approach,  steepest  descent  method 

Fig.  2.18  -  Gradient  approach,  accelerated  steepest
descent  method

Fig.  2.19  -  Conjugate  gradient  approach,  Fletcher-Reeves 
method 

Fig.  2.20  -  Conjugate  gradient  approach,  Pollack  method 

Fig.  2.21  -  Variable  metric  approach,  Davidon-Fletcher- 
Powell  method 

Fig.  2.22  -  Amir's  transform  approach. 

In each figure, four results are presented as
functions of iteration steps:  filter coefficients, output
variance, processing gain and output signal to noise ratio.

Further, additional numerical results are summarized
and presented in Table II-3:

-  Output signal to noise ratio

-  Processing gain

-  Mean of filtered image

-  Variance of filtered image

-  Number of iteration steps to reach below
the prescribed stopping error

-  Actual adaptation error.

Discussion: 

a.   In the mMSE adaptive filter study, we first
presented the results of adaptive filter design by the LMS
algorithm because it is the most frequently used method.   We
extended it to two dimensions and used it as a benchmark for
comparison.   For the MSNR criterion, we have not yet found
any past study of adaptive filters based on it.   Therefore,
comparisons of convergence results are based on the several
methods developed in this thesis study.

b.  However, we can compare the background clutter
suppression results of the mMSE and MSNR adaptive filters.
For point targets, their steady state filter coefficients are
the same if the coefficients of the estimation pixel are all
normalized to unity.   Therefore, the statistical properties
of the filtered image are the same, i.e., the error variance
and the mean of the image after processing by the two types
of filters are identical.   For the Indiana image, the mean
and variance of the unfiltered and filtered images are:

             Before filtering    After filtering
mean             3.30397           0.00006495
variance         0.74111           0.012

c.  The convergence speeds are different, as shown
in Table II-3.   For a stopping condition of 10    , the
numbers of iteration steps to reach below this condition are:

SD   = 739      CGF = 76      DFP = 25
ASD  = 739      CGP = 76      AT  =  2

[Figs. 2.17-2.22, MSNR adaptive spatial filters, Indiana
image. Captions recovered: Fig. 2.17a, c Steepest Descent
Method; Fig. 2.18a-d Accelerated Steepest Descent;
Fig. 2.19a, c, d Fletcher-Reeves Method; Fig. 2.20a, c, d
Pollack Method; Fig. 2.21a, c Davidon-Fletcher-Powell Method;
Fig. 2.22a-c Amir's Transform Method. Panel title recovered:
Filter Vector. Axis label recovered: Iteration #.]

The trend found in the mMSE cases also holds for the
MSNR cases.   The variable metric method (DFP) is faster than
the conjugate gradient methods (CGF, CGP), which are faster
than the gradient methods (SD, ASD).

It is important to point out that the transform
method (AT), which does not have a corresponding method in
the mMSE cases, has the fastest convergence speed.   It took
only two steps, compared with the twenty-five steps required
for the variable metric method, to reach below the stopping
condition.

5.   Results  of  MSNR  Adaptive  Spatial  Filters 
II  -  China  Lake  Image 

Test results of MSNR adaptive spatial filters using
the China Lake test image are presented in the following
figures:

Fig.  2.23  -  Gradient  approach,  steepest  descent 
method 

Fig.  2.24  -  Gradient  approach,  accelerated  steepest 
descent  method 

Fig.  2.25  -  Conjugate  gradient  approach,  Fletcher- 
Reeves  method 

Fig.  2.26  -  Conjugate  gradient  approach,  Pollack 
method 

Fig.  2.27  -  Variable  metric  approach,  Davidon- 
Fletcher-Powell  method 

Fig.  2.28  -  Amir's  transform  method. 

Several numerical results are presented in Table II-4:

-  Output signal to noise ratio

-  Processing gain

-  Mean of filtered image

-  Variance of filtered image

-  Number of iteration steps to reach below
the prescribed stopping error

-  Actual adaptation error.

Discussion 

a.  Gradient  approaches  have  not  been  included  in 
these  presentations  because  their  convergence  speeds  are  not 
as  fast  as  the  conjugate  gradient,  variable  metric  and  Amir's 
transform  methods. 

b.  Again, Amir's transform method has the fastest
convergence speed.   It took only three steps to reach below
the stopping condition, compared with the fifteen steps
required by the next fastest method, the variable metric method.

c.  Based on the experience using the Indiana image
and the China Lake image, the comparative behaviors among
these methods are similar.

[Figs. 2.23-2.28, MSNR adaptive spatial filters, China Lake
image. Captions recovered: Fig. 2.23a-c Steepest Descent
Method; Fig. 2.24a-c Accelerated Steepest Descent Method;
Fig. 2.25a, b, d Fletcher-Reeves Method; Fig. 2.26a-d Pollack
Method; Fig. 2.27a-c Davidon-Fletcher-Powell Method;
Fig. 2.28a, c Amir's Transform Method. Axis labels recovered:
Processing Gain [dB] versus Iteration #.]

III.   THE MULTIPLE MICROCOMPUTER SYSTEM

A.   INTRODUCTION 
1.   General 

Signal  processing  algorithms  are  usually  developed 
on  main  frame  computers.   The  transfer  of  these  algorithms 
to  on-board  processors  in  practical  systems  is,  in  general, 
not  an  easy  task  because  there  are  many  constraints  in  real 
systems  such  as  the  processing  speed,  weight,  volume,  power, 
fault  tolerance  and  others.   This  thesis  undertook  both  the 
theoretical  development  task  and  the  practical  implementation 
investigation.   Specifically,  this  chapter  will  present  the 
second  part  of  this  thesis  which  considers  the  implementation 
of  adaptive  image  processing  algorithms  developed  in  the  last 
chapter  by  a  multiple  microcomputer  system  using  concurrent 
parallel  and  pipeline  processing. 
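
The idea of partitioning a filtering task among several processors can be sketched as follows. This is only an illustration under stated assumptions, not the software of the system described in this chapter: worker threads stand in for microcomputers, the 32 x 32 test pattern and the 3 x 3 kernel are made-up values rather than adapted coefficients, and the function names are hypothetical. Python threads do not give true concurrency for this kind of arithmetic, so the sketch shows only how the work is divided and reassembled.

```python
from concurrent.futures import ThreadPoolExecutor

def filter_rows(image, kernel, row_start, row_stop):
    """3 x 3 filter over interior pixels of rows row_start..row_stop-1."""
    width = len(image[0])
    band = []
    for r in range(row_start, row_stop):
        out_row = []
        for c in range(1, width - 1):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
            out_row.append(acc)
        band.append(out_row)
    return band

def filter_partitioned(image, kernel, workers=4):
    """Split the interior rows into bands and filter one band per worker."""
    last = len(image) - 1                     # interior rows are 1..last-1
    size = -(-(last - 1) // workers)          # ceiling of rows per worker
    bands = [(start, min(start + size, last))
             for start in range(1, last, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda b: filter_rows(image, kernel, *b), bands))
    return [row for part in parts for row in part]

# Assumed 32 x 32 test pattern and a fixed mean-removing kernel.
IMAGE = [[float((7 * r + 3 * c) % 11) for c in range(32)] for r in range(32)]
KERNEL = [[-0.125, -0.125, -0.125],
          [-0.125, 1.0, -0.125],
          [-0.125, -0.125, -0.125]]

serial = filter_rows(IMAGE, KERNEL, 1, 31)
parallel = filter_partitioned(IMAGE, KERNEL)
print(parallel == serial, len(parallel), len(parallel[0]))
```

Each band needs one extra row of the input above and below it, which is why the workers read from the shared image rather than from private copies of their bands.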

It is important to point out that the digital computer
is not the only technique for real time implementation.
Depending on the amount and rate of signal data, precision and
dynamic range requirements, need of programmability and
several other factors, different signal formats, device
technologies and signal/data processor architectures
should be considered.   In many cases, combinations of analog,
sampled analog and digital processing approaches using optical,
electronic and acoustical devices probably will offer
cost-effective and optimum performance [178-180].   However, with
the rapid advances of VLSI/VHSIC technologies in both
increasing speed and decreasing power, size and cost, the
importance of digital electronic implementation in the form
of distributed processing using multiple processors has been
increasing at a rapid rate and will undoubtedly play a more
and more important role in real on-board implementation.
This part of the thesis investigates the feasibility and
potential of using multiple microcomputer systems for real
implementation.

2.   Multiple  Processor  Developments 

Multiple microcomputer systems are a subset of larger
families of multiple processor systems whose developments were
started over twenty years ago.   It has been obvious for a long
time that several processors are better than one.   However, how
they should be connected together and effectively used has
not been obvious at all.   The answer depends on many factors.
First, what is the objective?   Is it real time processing,
fault tolerance, multiple users, security, or some combination
of these?  Second, what are the available technologies
in both hardware and software?   Third, what are the
constraints in cost, weight, volume, development time and
available manpower?   The answers have been very different
depending on many of these factors.   We can identify several
major areas of multiple processor developments since the
early 1960's.

a.  Supercomputers [151, 152]

The first area can be generally called the
"supercomputers."  Several processors were connected in different
ways to offer parallel processing [153-155], pipeline
processing [156-158] or combined parallel/pipeline processing
capabilities.  In some cases, specially designed signal processors,
called array processors, are connected to a host computer to
offer very fast data crunching capabilities.   In most of these
cases, the basic processing elements that form the multiple
processor systems are special arithmetic or signal processing
units, not stand-alone computers.   Their intercommunications
and the signal flow are usually fixed in the design stage to
achieve very fast computing speed and are not changed during
operation.   Several representative systems are listed in
Table III-1.   Their common objective is "fast computation"
and "high throughput."   The processing elements are tightly
coupled.

b.  Computer Networks [161, 162]

The second area can be generally called the
"computer network."   Several processors are connected
together for intercommunication and resource sharing.   The
basic processing elements are mainly stand-alone computers.
A problem is usually not partitioned and performed
concurrently on several processing elements.   The system is, in
general, loosely coupled.   The communication is carried out
by messages with appropriate synchronization codes at the
beginning and the ending of the message.

c.  Ultra-Reliable Fault Tolerant or Highly
Available, Graceful Degrading Computers
[166, 167]

The third area can be generally called "Fault
Tolerant or Highly Available" computers.   Multiple processing
elements have been connected in different ways to offer
either fail-soft, fail-safe or graceful degradation
capabilities.  In most fault tolerant computers, the redundancy
and/or sparing are usually made at the building block level,
such as the CPU, RAM, I/O ports, etc., to make a very reliable
and fault tolerant single computer [168].   The
intercommunications among the elements are generally fixed.   In recent
years, because of the steady decrease of the cost of a
computer, the basic processing elements in a multiple processor
system are a small number of stand-alone computers [169-171].
These systems started a new direction in the multiple processor
developments because the intercommunications among the
processing elements are no longer fixed.   The processing tasks can
be flexibly assigned to different processors.   This dynamic
assignment, or allocation capability, also allows a new system/
software approach to fault tolerance and fault repair.
3.   Multiple  Microcomputer  System  Developments 

The rapid advance of low cost and small microcomputers
has extended the development described above into a new
dimension because a large number of microcomputers, instead of
just a few, can conceivably be interconnected into a system.
Not only can its fault tolerance capability be further
increased, the computational or signal processing capability
can also be much enhanced by providing concurrent parallel
and pipeline processing capabilities.

Multiple minicomputer system development began at
Carnegie Mellon University with the C.mmp system [172].
Although it used PDP-11 minicomputers, its tightly coupled
architecture and dynamic memory allocation concept allowed a
relatively large number of processing elements to join together
into a single system.   This development was soon followed by a
tightly coupled multiple microcomputer project, Cm* [173],
also at Carnegie Mellon University.   Since that time, several
tightly coupled systems have been proposed [174-183].   Some of
them have gone beyond the conceptualization stage and started
serious hardware/software development efforts.   However, none
has reached the operational stage at this writing.

At the same time, another direction of multiple
microcomputer development has been pursued toward the "computer
network" objective [184-188].   These systems can be
distinguished from the developments described above in the
following major aspects:

-  Different types of processing elements are used.
In other words, they are "heterogeneous."

-  The processing elements are loosely coupled.

-  The bandwidth of the intercommunication buses is
relatively low.




4.   This  Thesis  Research 

The second part of this thesis research is to develop
a multiple microcomputer system and to investigate its
feasibility in implementing real time on-board signal/data
processing for a smart sensor system.   It is similar to a number
of multiple microcomputer systems developed in the past
three to four years which permit up to 16 microcomputers to
be interconnected in some way to perform computations.
However, their objectives, architectures, intercommunication
concepts, controllers, hardware buses and processing elements,
software operating systems, etc. are quite different.

This  thesis  project  is  presented  by  highlighting  the 
following  features: 

a.  Its  objectives  are  to  provide  a  multiple  tasking 
system  including  fast  image/signal  processing  capability  and 
other  more  moderate  speed  but  highly  reliable  signal/data 
processing  capability  for  system  management,  command  and 
control. 

b.  Some  of  the  signal/data  processing  tasks  will  be 
performed  by  tightly  coupled  processors.   But  the  processors 
performing  other  tasks  do  not  have  to  be  all  tightly  coupled 
together.   Therefore,  a  mixed  tightly  and  loosely  coupled 
system  is  envisioned. 

c.  A part of the system must perform some critical
tasks which require ultra-reliability.   Other parts of the
system only require fail-soft and graceful degradation
performance.   In any event, a dynamic allocation capability is
required that allows flexible assignment of microcomputers to
various tasks and thereby provides some fault tolerance.
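
The dynamic allocation requirement can be pictured with a very small sketch. It is purely illustrative: the task names, the processor labels and the round-robin rule are assumptions made for this example and do not describe the controller developed in this thesis.

```python
def allocate(tasks, processors, failed=frozenset()):
    """Round-robin assignment of tasks to processors not marked as failed."""
    healthy = [p for p in processors if p not in failed]
    if not healthy:
        raise RuntimeError("no healthy processor left")
    return {task: healthy[i % len(healthy)] for i, task in enumerate(tasks)}

# Hypothetical task and processor names.
TASKS = ["spatial_filter", "threshold", "acquisition", "tracking"]
PROCESSORS = ["mc0", "mc1", "mc2", "mc3"]

normal = allocate(TASKS, PROCESSORS)
degraded = allocate(TASKS, PROCESSORS, failed={"mc2"})
print(normal)
print(degraded)
```

When one processor is marked as failed, its task moves to a remaining processor and the system continues with reduced capacity, which is the fail-soft behavior described above.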

For these requirements, a multiple star/multiple
cluster system of 16 bit microcomputers was developed.   Its
general concept and philosophy were developed by a top-down
system design procedure which will be presented in the next
section, III.B.   It will be explained how a choice was made
considering several alternatives and seven important issues
related to the system.   In Section III.C, detailed
implementation of these choices will be presented by describing the
principles and circuits of this multiple microcomputer system
in five categories:

System architecture
Processing resources
Intercommunication network
Intercommunication procedures
Multibus communication.

The  performance  of  this  system  is  described  in  Section  III.E. 

B.   DESIGN  CONSIDERATIONS  FOR  THIS  MULTIPLE 
MICROCOMPUTER  SYSTEM 

1.   Introduction 

Although only two large multiple microcomputer systems
and one multiple minicomputer system have appeared in the
literature and reached operational status, a large number of
different architectures have been proposed and some are in
the process of being implemented.   The three operational
systems are all from Carnegie-Mellon University:   Cm*
[172], C.vmp [191] and C.mmp [173].   There are now many options
for the hardware and software design of a multiple
microcomputer system.

This  thesis  took  a  top-down  system  design  approach 
to  reach  the  choices  made  for  the  design  of  our  system.  This 
design  process  is  presented  in  several  steps  in  this  section 
to  explain  the  general  idea  and  philosophy  of  this  system. 
In  the  next  section,  the  detailed  design  of  various  parts 
will  be  described. 
2.   Architecture

This  thesis  is  primarily  concerned  with  the  imple- 
mentation of  adaptive  image  processing.   It  is  important, 
however,  to  realize  that  the  adaptive  filter  is  only  one  part 
of  a  longer  end-to-end  image  processing  program  for  detecting, 
tracking and recognizing targets in noisy images. The adaptive
spatial filter enhances the target signal to noise ratio by
suppressing the background clutter; the ratio may be enhanced
further by additional image processing techniques, such as the
adaptive temporal filters. The clutter suppression stage is
followed by thresholding, target acquisition, recognition and
tracking stages. These signal processing operations are quite
different. For example, adaptive spatial filters require the
computation of statistical image characteristics and the solution
of matrix equations. Adaptive thresholding requires the comparison
and rearrangement of real numbers. Target acquisition usually
involves pattern tests of numbers based on spatial, temporal
and/or spectral information.
Therefore,  although  each  individual  signal  processing  stage 
requires  real-time  or  fast  execution  speed,  different  signal 
processing  stages  do  not  depend  on  one  another  during  process- 
ing.  Furthermore,  it  is  important  to  realize  that  processing 
of  target  signals  for  the  mission  objective  is  only  one  part, 
although  a  very  important  part,  of  the  total  signal/data  pro- 
cessing requirements  for  the  whole  system.   There  are 
processing  functions  such  as  management,  control,  communica- 
tion and  others  which  must  also  be  implemented.   The  nature 
and  requirements  of  their  processing  operations  are  quite 
different  and  vary  over  a  wide  range.   Some  do  not  need  high 
processing  speed  but  demand  very  high  reliability.   Others 
do  limited  computation  but  handle  large  amounts  of  data.   In 
general,  the  signal/data  processing  requirements  of  many 
systems cover a wide range. Therefore, we designed an
architecture which has several levels of coupling among
processing elements.

At the first level, special processors may be directly coupled to
a microcomputer. At the second level, several microcomputers are
connected in parallel to the same system bus and form a "cluster."
A microcomputer can communicate directly, through common memory,
with any other microcomputer on the same bus, that is, within the
same cluster. This is a tightly coupled, bus oriented multiple
microcomputer architecture. At the third level, four clusters are
connected by way of a "complete star" bus switch network and form
a "star." Communication between microcomputers in two different
clusters is accomplished by way of the switch network. They are
therefore not as tightly coupled as microcomputers within a
cluster, because intercluster communication carries more overhead
than intracluster communication. However, it was found that with
specially designed controllers for the intercluster communication,
the access time increased by only 9%. These data are presented in
Section III.E. We can therefore consider microcomputers in
different clusters within the same star to be still tightly
coupled. At the fourth level, several "stars" are connected
together by linking nearest neighboring "stars" through a bus
switch to form a "lattice network." Intercommunication between
microcomputers in two stars is similar to that within a star,
involving one central controller and two distributed controllers,
and the overhead is practically the same. From the
intercommunication viewpoint, therefore, microcomputers in two
different stars, and indeed throughout the system, are practically
all tightly coupled. Through programming, however, they may be
used in tight coupling, loose coupling or any combination in
between to suit the requirements of the application.

3.   Intercommunication  and  Control

Because of the hierarchical structure of the architecture, the
intercommunication processes and their controls are also
hierarchical and distributed. They are hierarchical because there
are three levels of control, as shown in Table III.1.

At the lowest level, intracluster communication, no bus switch is
needed. A Random Priority Controller (RPC) is used for
arbitration. Only a small portion of the distributed controller
is used, mainly to check whether requests from outside the cluster
have been granted. At the next higher level, intercluster
communication, the intrastar bus switch is used. Arbitration is
accomplished by both the distributed controller and the RPC. Only
a portion of the central controller is used, to grant the
intercluster request. At the highest level, both interstar and
intrastar bus switches may be used and all controllers (central,
distributed and random priority) are in full action.
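
The three control levels described above can be summarized as a small lookup table. The following Python sketch is only a reading of the preceding paragraph, not a reproduction of Table III.1; the names `Level`, `CONTROL` and `controllers_involved`, and the "partial"/"full" labels, are illustrative assumptions.

```python
from enum import Enum


class Level(Enum):
    INTRACLUSTER = 1   # same cluster, same system bus
    INTERCLUSTER = 2   # different clusters, same star
    INTERSTAR = 3      # different stars


# For each level: the bus switches that may be used, and how much of each
# controller takes part ("partial" or "full"), as read from the text.
CONTROL = {
    Level.INTRACLUSTER: {
        "switches": (),
        "controllers": {"random priority": "full", "distributed": "partial"},
    },
    Level.INTERCLUSTER: {
        "switches": ("intrastar",),
        "controllers": {"random priority": "full", "distributed": "full",
                        "central": "partial"},
    },
    Level.INTERSTAR: {
        "switches": ("intrastar", "interstar"),
        "controllers": {"random priority": "full", "distributed": "full",
                        "central": "full"},
    },
}


def controllers_involved(level):
    """Return the sorted names of the controllers that take part at `level`."""
    return sorted(CONTROL[level]["controllers"])
```

Keeping the description in one table makes it easy to see that each higher level adds a switch and brings one more controller into play.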

Further, the controllers are distributed because there are four
identical RPCs and four identical distributed controllers, one of
each in every cluster. Although there is only one central
controller, it consists of four identical units, one for each
cluster. The advantages of such a distributed control system are:
(1) Parallel control actions, which enhance the speed of "request
arbitration." (2) Improved fault tolerance, because the control
actions are shared among four separate units and, should one
malfunction, the other three can still continue their functions.

4.  Hardware  Implementation  of  Controllers

Controller circuits can be implemented in several ways:

a.  Microprocessor  control 

b.  Bit  slice  processor  control 

c.  Digital  logic  circuit  control. 

Two  performance  characteristics  should  be  considered  in  their 
choice  and  design:   programmability  and  speed.   The  micropro- 
cessor approach  has  the  most  versatile  programmability  but 
the  slowest  speed.   The  digital  hardware  approach  has  very 
limited  programmability  but  the  fastest  speed  of  the  three. 
An  estimate  has  been  made  to  compare  their  speeds. 

In our design, the primary goal is the fastest possible response
to and arbitration of requests, together with the highest
communication speed. Therefore, we chose the digital logic circuit
approach. Because such circuits offer very limited
programmability, great care was given to the design of the
controller concepts and circuits to avoid unexpected changes
later. Further, Schottky and low power Schottky chips were used
because of their speed and power trade-offs. CMOS chips were found
to be too slow and to lack adequate driving capability.

5.  Priority  Resolver 

There  are  several  ways  to  arbitrate  multiple  requests 
or  to  resolve  priorities: 
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„.   ,       ..      Serial  (Daisy  chain) 
Fixed  priority     Parallei    y 

Rotating  priority 

FIFO 

Random  priority- 
There are two primary requirements for a priority resolver
circuit: uniform and fast resolution of bus requests. In this
system, an Intel Multibus is used as the system bus, with a 10 MHz
bus clock frequency. We decided to design a priority resolver
circuit that arbitrates among 8 SBCs within one bus clock period.

The fixed priority approach was not selected because it was unable
to arbitrate multiple bus requests and grant bus usage uniformly.
Test results showed that, with fixed priority in our tightly
coupled environment, two SBCs are able to share the bus
adequately, but more than two SBCs produce unacceptable delays.

Rotating  priority  is  much  faster  than  the  fixed  pri- 
ority approach.   It  is  able  to  arbitrate  multiple  requests 
and  does  grant  their  bus  usages  uniformly.   However,  it  was 
not  our  final  choice  because  the  random  priority  approach  was 
found  to  be  faster.   This  is  because  in  the  rotating  priority 
approach,  every  bus  request  line  is  tested  serially  (in  a 
rotating  manner)  whether  there  are  request  signals  on  these 
lines  or  not.   In  the  worst  case,  the  rotating  priority  re- 
solver grants  the  bus  after  N  searches,  where  N  is  the  number 
of  SBCs  being  arbitrated  by  the  resolver. 
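
The serial scan of a rotating priority resolver can be sketched as follows. This is a minimal illustration in Python, not the circuit evaluated in the thesis; the function name `rotating_resolve` and the convention that the scan begins at a caller-supplied `start` line are assumptions made for the example.

```python
def rotating_resolve(requests, start):
    """Test request lines one at a time, beginning at `start`.

    requests: list of booleans, one per SBC (True = bus requested).
    Returns (granted_line, lines_tested); granted_line is None when no
    line carries a request.
    """
    n = len(requests)
    for tested in range(1, n + 1):
        line = (start + tested - 1) % n   # wrap around in rotating fashion
        if requests[line]:
            return line, tested
    return None, n
```

With eight lines and a single request on the line tested last, for example `rotating_resolve([False] * 7 + [True], 0)`, all N = 8 lines are examined before the grant, which is the worst case described above.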

First in-first out (FIFO) is a resolver approach which requires
memory. Because of the time needed to reference the memory, it is
not possible to build a FIFO resolver that arbitrates 8 bus
requests within 100 nsec, the bus clock period. With current
technology, a fast FIFO arbiter probably requires more than
300 nsec.

The random priority resolver is designed based on the binary tree
synchronous selector concept. Consider our case of 8 SBCs in a
cluster. Three-stage selection is used. During the first stage,
four out of eight lines are checked simultaneously. In the second
stage, two out of these four lines are checked simultaneously
again. The final bus grant is made in the third stage. In other
words, the time for searching and resolving the bus requests is
proportional to log2(N) rather than N, which is faster than the
rotating priority resolver. Test results have shown that the
random priority resolver is able to arbitrate multiple bus
requests and grant bus usage uniformly, as demonstrated in
Fig. 3.17. Four SBCs simultaneously sharing the bus in a tightly
coupled environment are taken as the test case. These four SBCs
were programmed to request the bus almost 100% of the time. The
BPRN signals of these four SBCs are shown. A low BPRN signal
indicates that its SBC is using the system bus. The fact that none
of these four traces showed any long periods of bus usage or bus
waiting demonstrates that the random priority resolver is able to
arbitrate very heavy bus requests by these four SBCs and grant bus
usage to them "uniformly." It was found that, on the average, a
"bus request" is granted in about 60 nsec.
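
The three-stage selection can be illustrated with a short Python sketch. The text does not say how the circuit chooses between two halves that both carry requests, so the coin flip used here is an assumption, as are the function name `tree_resolve` and the `rng` parameter; the point of the sketch is only that the number of stages is log2(N).

```python
import random


def tree_resolve(requests, rng=random):
    """Binary tree selection among N request lines (N a power of two).

    At each stage the candidate lines are halved. A half with no request
    is discarded; if both halves hold requests, one is picked at random
    (an assumption, the thesis circuit may differ).
    Returns (granted_line, stages), or (None, 0) when nothing is requested.
    """
    n = len(requests)
    if n == 0 or n & (n - 1):
        raise ValueError("number of lines must be a power of two")
    if not any(requests):
        return None, 0
    lo, hi, stages = 0, n, 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left = any(requests[lo:mid])
        right = any(requests[mid:hi])
        take_left = rng.random() < 0.5 if (left and right) else left
        if take_left:
            hi = mid
        else:
            lo = mid
        stages += 1
    return lo, stages
```

For eight lines the loop always runs three times, compared with up to eight serial tests for the rotating scheme. Note that a fair coin at every stage does not by itself guarantee equal shares when requests are unevenly spread over the two halves.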

6.   Bus  Switches

Bus  switches  are  one  of  the  most  important  parts  of 
a  multiple  microcomputer  system  because  they  provide  the  in- 
terconnection means  among  the  processing  resources.   There 
are  two  aspects  of  the  "bus  switch"  problem:   bus  switch 
network  and  the  individual  bus  switch  link. 

Many switch networks have been investigated, some of which predate
computer development [195]. A small number of them have been
considered in multiple microcomputer development: cross-bar,
banyan, hyperconcentrator, simple ring, etc.

A combined approach with two levels of switching networks was
selected, in consideration of the multitask signal/data processing
requirements of a typical system. At the higher level, many stars
are interconnected in a lattice architecture. Interstar bus
switches are provided between neighboring nodes. At the lower
level, four clusters are included in each "star" node. They are
interconnected by a "complete star bus switch" network. The
complete star switch was chosen for two reasons. First, the
coupling within a star should be as tight as possible. The
complete star switch allows us to connect two clusters by the
shortest link. Second, if a link fails, the complete star switch
gives us two alternative ways to connect the two clusters, each by
way of a third cluster via two links, thus providing some fault
tolerance.
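
The fault tolerance argument can be checked by enumerating the links of a complete star of four clusters. In the Python sketch below, the cluster labels "A" to "D" and the function name `routes` are hypothetical; only the topology (every pair of clusters joined by one link) comes from the text.

```python
from itertools import combinations

CLUSTERS = ("A", "B", "C", "D")          # hypothetical cluster labels
LINKS = {frozenset(pair) for pair in combinations(CLUSTERS, 2)}


def routes(src, dst, failed=frozenset()):
    """List the usable routes from src to dst.

    The direct link is listed first (if it has not failed), followed by
    every two-link detour through a third cluster.
    failed: iterable of frozenset pairs naming the failed links.
    """
    up = LINKS - set(failed)
    found = []
    if frozenset((src, dst)) in up:
        found.append((src, dst))
    for via in CLUSTERS:
        if via in (src, dst):
            continue
        if frozenset((src, via)) in up and frozenset((via, dst)) in up:
            found.append((src, via, dst))
    return found
```

The set `LINKS` has six members, one per pair of clusters. If the A-B link is marked as failed, `routes("A", "B", failed={frozenset(("A", "B"))})` still returns the two detours through C and through D.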

The important part of the individual bus switch link is the
switches themselves. For the Intel Multibus, we found that 58 of
the 86 lines should be switched. There are several choices for
the switches:

Bidirectional: MOS types of switches, such as CMOS, VMOS and DMOS.

Unidirectional: Bipolar types such as Schottky, low power Schottky
and ECL types; optoelectronic types.

Optoelectronic types of switches were not chosen because they are
slow, on the order of 10 µsec. Very fast switching speeds, on the
order of several tens of nsec, are required because today's
Multibus runs at 10 MHz, which corresponds to a clock period of
only 100 nsec. CMOS, VMOS and DMOS switches could provide such
switching speeds. However, they do not have enough driving
capability for the 15 mA or more required by many of the control
and address signals of the microcomputer. Therefore, these MOS
switches were not chosen, although their bidirectional feature and
the low power characteristics of the CMOS switches are extremely
attractive and reliable. We chose the low power Schottky switches
because of their speed and driving capability. A typical
performance is shown in Fig. 3.15, which shows the waveform of an
address signal before and after the switch. It can be seen that
not only is the delay short, on the order of 25 nsec, but the
waveform is also improved by the switch because of its good
driving capability of up to 50 mA. The switch was tested with a
minimum load resistance of 50 ohms and a maximum capacitance of
270 pF, and it continued to function satisfactorily up to 45 MHz.
One disadvantage is the need to use two back-to-back switch
circuits for bidirectional switching of each signal. Therefore, a
special circuit was designed to provide not only the "enable"
signal but also the "direction" signal.

7.   Processing  Elements

There are two major types of processing elements on the system
bus: general purpose microcomputers and special purpose
processors. The latter can be further separated into two
subcategories. One is a special purpose processor, like an array
processor, which can perform several signal processing operations
such as fast Fourier transform (FFT), correlation, convolution,
finite impulse response filtering, infinite impulse response
filtering, etc. The second is a special purpose processor which is
designed to perform only one signal processing operation, such as
the FFT.

a.  General  Purpose  Microcomputer 

It  was  decided  that  all  general  purpose  microcom- 
puters used  in  our  system  should  be  treated  homogeneously. 
This  is  necessary  because  two  major  principles  of  our  operat- 
ing system  are  based  on  the  "virtual  processor"  [189]  and 
"dynamic  process  allocation"  [190]  concepts  which  require 
homogeneous  processing  elements. 

b.  Special  Purpose  Processors 

It was decided that special purpose processors could not be
treated in the same way as the microcomputers. However, it has not
been decided at this time exactly how these special purpose
processors should be handled. There are two important
alternatives. In one case, a special purpose processor is treated
as an I/O port managed by the operating system. In the other case,
a special purpose processor is operated in a "slave" mode on the
system bus.

8.   Mode  of  Data  Transfer

The basic mode of data transfer in most multiple processor systems
is "message transfer" communication. However, a basic philosophy
of our operating system is the "loop free" structure, which
requires frequent references to synchronization primitives. In
other words, the operating system program on a microcomputer needs
to reference synchronization primitives located in either internal
or external global memories. These "references" are executed via
the system bus. If the data transfer is "message" based, the
synchronization of processes could be delayed because the system
bus is occupied by a long message transfer. To avoid such a delay,
it was decided that the basic mode of data transfer should be the
"word transfer." This allows several microcomputers to reference
their synchronization primitives and other data in an
"interleaved" mode.

However, the transfer of data in "blocks" is possible if required.
This is accomplished by a special feature of the Intel 16 bit 8086
microprocessor, which can generate a bus lock signal of a duration
specified by software. This bus lock signal holds the bus until
the block transfer is complete. Thus, data transfer by "message
based communication" is possible as well.
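
The effect of the two transfer modes on a waiting synchronization reference can be illustrated with a toy bus schedule. The Python sketch below is an assumption-laden model: the function name `bus_schedule`, the SBC labels and the simple cyclic grant order are invented for the example (the real system arbitrates with the random priority resolver), and every transfer is taken to last one bus cycle per word.

```python
from collections import deque


def bus_schedule(pending, block_mode=False):
    """Return the order in which words appear on the bus.

    pending: dict mapping a bus master name to its list of words to send.
    Word mode: the bus is re-arbitrated after every word.
    Block mode: a master holds the bus (bus lock) until its queue is empty.
    """
    queues = {master: deque(words) for master, words in pending.items()}
    order = []
    while any(queues.values()):
        for master, queue in queues.items():
            if not queue:
                continue
            if block_mode:
                while queue:
                    order.append((master, queue.popleft()))
            else:
                order.append((master, queue.popleft()))
    return order
```

With `{"SBC0": ["d0", "d1", "d2", "d3"], "SBC1": ["sync"]}`, the synchronization reference of SBC1 is served second in word mode but only after all four words of SBC0 in block mode, which is the delay the word transfer mode is meant to avoid.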

C.   DESCRIPTION  OF  THIS  MULTIPLE  MICROCOMPUTER  SYSTEM 

1.  Introduction 

In the last section, we presented the reasons for choosing the
specific approaches for various parts of our multiple
microcomputer system, based on a top-down design procedure to meet
the requirements of this type of smart sensor system. In this
section, a more detailed description is given to explain how those
choices are implemented. The presentation is made in five major
categories:

System  architecture  (Section  C.2) 

Processing  resources  (Section  C.3) 

Intercommunication  network  (Section  C.4) 

Intercommunication  procedures  among  resources 
in  different  clusters  and  stars  (Section  C.5) 

Multibus  communication  (Section  C.6) 

Performance of this multiple microcomputer system will be
presented in Section D.

2.  System  Architecture 

The topology of this system consists of many "star" nodes
interconnected by links to nearest neighbor stars. A two
dimensional example is shown in Fig. 3.1. Each star has four links
connected to its four neighbors. The links are bidirectional
system buses, each with a bus switch called an "inter-star bus
switch" (ISBSW). The bus switch consists of 60 bidirectional
switches for 60 signal lines. Two types of switches have been
investigated: one with latches and one without latches for the
signal lines.

Each  "star"  consists  of  four  clusters  interconnected 
by  a  complete  star  "bus-switch  network."  Each  "cluster" 
consists  of  up  to  eight  microcomputers.   Other  processing 
elements  and  one  or  more  RAM  boards  are  also  connected  onto 
the  system  Multibus.   Fig.  3.2  depicts  the  topology  of  a 
single  star  with  four  clusters.   In  this  example,  the  bus 
switch  network  consists  of  six  bidirectional  system  buses, 
each  with  a  bus  switch  interconnected  as  shown  in  Fig.  3.7. 

3.   Processing  Resources

Two  types  of  processing  resources  are  used  in  this 
system. 

a.   Basic  Processing  Elements  -  SBC  8612A 

Intel's  16  bit  single  board  microcomputers,  SBC 
8612A,  are  used  as  the  basic  processing  elements.   A  block 
diagram  of  the  SBC  8612A  is  shown  in  Fig.  3.3. 

(1) The Single Board Microcomputer SBC-8612A. The iSBC 8612A
Single Board Computer is a complete 16 bit computer system on a
single printed circuit assembly. The iSBC 8612A board includes a
16 bit central processing unit (CPU), up to 32K bytes of dynamic
RAM, a serial communications interface, three programmable
parallel I/O ports, three programmable timers, priority interrupt
control, Multibus interface control logic, and bus expansion
drivers for interface with other Multibus interface-compatible
expansion boards. Also included is dual port control logic to
allow the iSBC 8612A board to act as a slave RAM device to other
Multibus interface masters in the system. Provision is made for
user installation of up to 16K bytes of read only memory.

The  iSBC  8612A  Single  Board  Computer  is
controlled  by  an  Intel  8086  16  bit  microprocessor  (CPU).
The  8086  CPU  includes  four  16  bit  general  purpose  registers 
that  may  also  be  addressed  as  eight  8  bit  registers.   In 
addition,  the  CPU  contains  two  16  bit  pointer  registers  and 
two  16  bit  index  registers.   Four  16  bit  segment  registers 
allow  extended  addressing  to  a  full  megabyte  of  memory.   The 
CPU  instruction  set  supports  a  wide  range  of  addressing  modes 
and  data  transfer  operations,  signed  and  unsigned  8  bit  and 
16  bit  arithmetic  including  hardware  multiply  and  divide,  and 
logical  and  string  operations.   The  CPU  architecture  features 
dynamic  code  relocation,  reentrant  code,  and  instruction  look- 
ahead. 
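
The way 16 bit segment registers reach a full megabyte can be shown with a short calculation. The sketch below uses the standard 8086 rule, which is not spelled out in the text above: the segment value is shifted left by four bits and added to a 16 bit offset, giving a 20 bit physical address. The function name `physical_address` is chosen for the example.

```python
def physical_address(segment, offset):
    """Form the 20 bit 8086 physical address from segment:offset.

    Both inputs are 16 bit values. The sum is truncated to 20 bits
    because the 8086 has only 20 address lines.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16 bit values")
    return ((segment << 4) + offset) & 0xFFFFF
```

For instance, segment 1234H with offset 0010H addresses location 12350H, and FFFFH:000FH is the last byte of the 1 megabyte space.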

The iSBC 8612A board has an internal bus for all on-board memory
and I/O operations and accesses the system bus (Multibus
interface) for all external memory and I/O operations. Hence,
local (on-board) operations do not involve the Multibus interface,
which leaves the Multibus interface available for true parallel
processing when several bus masters (e.g., DMA devices and other
single board computers) are used in a multimaster scheme.

Dual  port  control  logic  is  included  to 
interface  the  dynamic  RAM  with  the  Multibus  interface  so 
that  the  iSBC  8612A  board  can  function  as  a  slave  RAM  device 
when  not  in  control  of  the  Multibus  interface.   The  CPU  has 
priority  when  accessing  on-board  RAM.   After  the  CPU  com- 
pletes its  read  or  write  operation,  the  controlling  bus  mas- 
ter is  allowed  to  access  RAM  and  complete  its  operation. 
Where  both  the  CPU  and  the  controlling  bus  master  have  the 
need  to  write  or  read  several  bytes  or  words  to  or  from  on- 
board RAM,  their  operations  are  interleaved.   For  CPU  access, 
the  on-board  RAM  addresses  are  assigned  from  the  bottom  up 
of  the  1  megabyte  address  space;  i.e.,  00000-07FFFH.   The 
slave  RAM  address  decode  logic  includes  jumpers  and  switches 
to  allow  positioning  the  on-board  RAM  into  any  128K  segment 
of  the  1  megabyte  system  address  space. 
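
The figures in this paragraph can be cross-checked with a few lines of arithmetic. The Python sketch below only restates the numbers given in the text; the constant and function names are hypothetical, and nothing is assumed about where inside a 128K segment the jumpers actually place the slave RAM.

```python
ADDRESS_SPACE = 1 << 20        # 1 megabyte system address space
SEGMENT_SIZE = 128 * 1024      # positioning granularity for the slave RAM

ONBOARD_FIRST = 0x00000        # CPU view of on-board RAM, from the text
ONBOARD_LAST = 0x07FFF


def onboard_ram_bytes():
    """Size of the on-board RAM window seen by the CPU."""
    return ONBOARD_LAST - ONBOARD_FIRST + 1


def segment_count():
    """Number of 128K segments in the 1 megabyte space."""
    return ADDRESS_SPACE // SEGMENT_SIZE


def segment_index(address):
    """Index (0-based) of the 128K segment containing `address`."""
    if not 0 <= address < ADDRESS_SPACE:
        raise ValueError("address outside the 1 megabyte space")
    return address // SEGMENT_SIZE
```

The range 00000-07FFFH therefore corresponds to 32K bytes, and the jumpers and switches select one of eight possible 128K segments for slave access.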

The  slave  RAM  can  be  configured  to  allow 
either  8K,  16K,  24K,  or  32K  access  by  another  bus  master. 
If  the  iSBC  300  Multimodule  RAM  option  is  installed,  the 
memory  increments  are  16K,  32K,  48K,  or  64K.   Thus,  the  RAM 
can  be  configured  to  allow  other  bus  masters  to  access  a 
segment  of  the  on-board  RAM  and  still  reserve  another  segment 
strictly  for  on-board  use.   The  addressing  scheme  accommodates 
both  16  bit  and  20  bit  addressing. 

Four  IC  sockets  are  included  to  accommodate
up  to  16K  bytes  of  user-installed  read  only  memory.   Config- 
uration jumpers  allow  read  only  memory  to  be  installed  in  2K, 
4K,  or  8K  increments. 

The  iSBC  8612A  board  includes  24  program- 
mable parallel  I/O  lines  implemented  by  means  of  an  Intel 
8255A  Programmable  Peripheral  Interface  (PPI).   The  system
software  is  used  to  configure  the  I/O  lines  in  any  combina- 
tion of  unidirectional  input/output  and  bidirectional  ports. 
The  I/O  interface  may  be  customized  to  meet  specific  periph- 
eral requirements  and,  in  order  to  take  full  advantage  of  the 
large  number  of  possible  I/O  configurations,  IC  sockets  are 
provided  for  interchangeable  I/O  line  drivers  and  terminators. 
Hence,  the  flexibility  of  the  parallel  I/O  interface  is  fur- 
ther enhanced  by  the  capability  of  selecting  the  appropriate 
combination  of  optional  line  drivers  and  terminators  to  pro- 
vide the  required  sink  current,  polarity,  and  drive/termination 
characteristics  for  each  application.   The  24  programmable 
I/O  lines  and  signal  ground  lines  are  brought  out  to  a  50  pin 
edge  connector  (J1)  that  mates  with  flat,  woven,  or  round
cable. 

The  RS232C  compatible  serial  I/O  port  is 
controlled  and  interfaced  by  an  Intel  8251A  USART  (Universal 
Synchronous/Asynchronous  Receiver/Transmitter)  chip.   The 
USART  is  individually  programmable  for  operation  in  most 
synchronous  or  asynchronous  serial  data  transmission  formats
(including  IBM  Bi-Sync).

In  the  synchronous  mode  the  following  are
programmable:

a.  Character  length, 

b.  Sync  character  (or  characters),  and

c.  Parity. 

In  the  asynchronous  mode  the  following  are 
programmable: 

a.  Character  length, 

b.  Baud  rate  factor  (clock  divide  ratios  of  1,  16,  or  64), 

c.  Stop  bits,  and 

d.  Parity. 
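
The baud rate factor in item b relates the clock supplied to the USART to the resulting line speed in asynchronous mode. The Python sketch below shows the division; the function name `async_baud` and the 153,600 Hz clock used in the usage note are illustrative assumptions, not values taken from the thesis.

```python
ALLOWED_FACTORS = (1, 16, 64)   # clock divide ratios listed in the text


def async_baud(clock_hz, factor):
    """Return the asynchronous baud rate for a given USART clock.

    clock_hz: transmit/receive clock frequency in hertz.
    factor:   baud rate factor, one of 1, 16 or 64.
    """
    if factor not in ALLOWED_FACTORS:
        raise ValueError("baud rate factor must be 1, 16 or 64")
    return clock_hz / factor
```

A hypothetical 153,600 Hz clock with a factor of 16 would give 9600 baud; the same clock with a factor of 64 would give 2400 baud.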

In  both  the  synchronous  and  asynchronous 
modes,  the  serial  I/O  port  features  half-  or  full-duplex, 
double  buffered  transmit  and  receive  capability.   In  addi- 
tion, USART  error  detection  circuits  can  check  for  parity, 
overrun,  and  framing  errors.   The  USART  transmit  and  receive 
clock  rates  are  supplied  by  a  programmable  baud  rate/time 
generator.   These  clocks  may  optionally  be  supplied  from  an 
external  source.   The  RS232C  command  lines,  serial  data  lines, 
and  signal  ground  lines  are  brought  out  to  a  50  pin  edge  con- 
nector (J2)  that  mates  with  flat  or  round  cable. 

Three independent, fully programmable 16 bit interval timer/event
counters are provided by an Intel 8253 Programmable Interval Timer
(PIT). Each counter is capable of operating in either BCD or
binary mode; two of these counters are available to the systems
designer to generate accurate time intervals under software
control. The outputs of these two counters may be independently
routed to the 8259A Programmable Interrupt Controller (PIC). The
gate/trigger inputs of the two counters may be routed to I/O
terminators associated with the 8255A PPI or taken as input
connections from the 8255A PPI. The third counter is used as a
programmable baud rate generator for the serial I/O port. In
utilizing the iSBC 8612A board, the systems designer simply
configures, via software, each counter independently to meet
system requirements. Whenever a given time delay or count is
needed, software issues commands to the 8253 PIT to select the
desired function. The contents of each counter may be read at any
time during system operation with simple operations for event
counting applications, and special commands are included so that
the contents of each counter can be read "on the fly."

The  iSBC  8612A  board  provides  vectoring  for 
bus  vectored  (BV)  and  non-bus  vectored  (NBV)  interrupts.  An 
on-board  Intel  8259A  Programmable  Interrupt  Controller  (PIC) 
handles  up  to  eight  NBV  interrupts.  By  using  external  PICs 
slaved  to  the  on-board  PIC  (master),  the  interrupt  structure
can  be  expanded  to  handle  and  resolve  the  priority  of  up  to
64  BV  sources.

The  PIC,  which  can  be  programmed  to  respond 
to  edge-sensitive  or  level-sensitive  inputs,  treats  each 
"true"  input  signal  condition  as  an  interrupt  request.   After 
resolving  the  interrupt  priority,  the  PIC  issues  a  single 
interrupt  request  to  the  CPU.   Interrupt  priorities  are 
independently  programmable  under  software  control.   The
programmable  interrupt  priority  modes  are: 

(a)  Nested  Priority.   Each  interrupt 
request  has  a  fixed  priority:   input  0  is  highest,  input  7 
is  lowest. 

(b)  Fully  Nested  Priority.   This  mode  is 
the  same  as  the  nested  mode,  except  that  when  a  slave  PIC  is 
being  serviced,  it  is  not  locked  out  from  the  master  PIC 
priority  logic  and  when  exiting  from  the  interrupt  service 
routine,  the  software  must  check  for  pending  interrupts  from 
the  slave  PIC  just  serviced. 

(c)  Auto-Rotating  Priority.   Each  interrupt 
request  has  equal  priority.   Each  level,  after  receiving 
service,  becomes  the  lowest  priority  level  until  the  next 
interrupt  occurs. 

(d)  Specific  Priority.   Software  assigns 
lowest  priority.   Priority  of  all  other  levels  is  in  numer- 
ical sequence  based  on  lowest  priority. 

(e)  Special  Mask.  Interrupts  at  the  level 
being  serviced  are  inhibited,  but  all  other  levels  of  inter- 
rupts (higher  and  lower)  are  enabled. 

(f)  Poll.   The  CPU  internal  interrupt
enable  is  disabled.   Interrupt  service  is  achieved  by  pro- 
grammer initiative  using  a  Poll  command. 
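
The difference between the nested (fixed) and auto-rotating modes can be seen in a small model. The Python sketch below is an illustration of the rules stated in items (a) and (c) only, not a model of the 8259A registers; the names `fixed_pick` and `RotatingPriority` are invented for the example.

```python
def fixed_pick(pending):
    """Nested mode: the lowest-numbered pending input wins (0 is highest)."""
    return min(pending) if pending else None


class RotatingPriority:
    """Auto-rotating mode: a level becomes lowest priority once serviced."""

    def __init__(self, levels=8):
        self.levels = levels
        self.lowest = levels - 1      # initially input 7 is lowest

    def pick(self, pending):
        """Return the pending level with the highest current priority."""
        for step in range(1, self.levels + 1):
            level = (self.lowest + step) % self.levels
            if level in pending:
                self.lowest = level   # the serviced level drops to the bottom
                return level
        return None
```

If inputs 0 and 3 request service continuously, `fixed_pick` always returns 0, whereas successive calls to `RotatingPriority.pick` return 0, 3, 0 and so on, giving each request equal treatment.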

The  CPU  includes  a  non-maskable  interrupt 
(NMI)  and  a  maskable  interrupt  (INTR).   The  NMI  interrupt  is
intended  to  be  used  for  catastrophic  events  such  as  power
outages  that  require  immediate  action  of  the  CPU.   The  INTR 
interrupt  is  driven  by  the  8259A  PIC  which,  on  demand,  pro- 
vides an  8  bit  identifier  of  the  interrupting  source.   The 
CPU  multiplies  the  8  bit  identifier  by  four  to  derive  a 
pointer  to  the  service  routine  for  the  interrupting  device. 
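
The multiplication by four reflects the layout of the 8086 interrupt vector table, in which each entry occupies four bytes: a 16 bit offset followed by a 16 bit segment, both stored low byte first. The Python sketch below illustrates the lookup on a byte array standing in for the first 1K of memory; the function names are chosen for the example.

```python
import struct


def vector_entry_address(type_id):
    """Address of the vector table entry for an 8 bit interrupt identifier."""
    if not 0 <= type_id <= 0xFF:
        raise ValueError("identifier must fit in 8 bits")
    return type_id * 4


def read_vector(memory, type_id):
    """Return (segment, offset) of the service routine for `type_id`.

    memory: bytes-like object holding at least the 1024 byte vector table.
    """
    offset, segment = struct.unpack_from("<HH", memory,
                                         vector_entry_address(type_id))
    return segment, offset
```

An identifier of 20H, for example, selects the four bytes starting at address 80H; the 256 possible identifiers fill exactly 1024 bytes.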

Interrupt  requests  may  originate  from  18 
sources  without  the  necessity  of  external  hardware.   Two 
jumper-selectable  interrupt  requests  can  be  automatically 
generated  by  the  Programmable  Peripheral  Interface  (PPI)  when 
a  byte  of  information  is  ready  to  be  transferred  to  the  8086 
CPU  (i.e.,  input  buffer  is  full)  or  a  byte  of  information  has 
been  transferred  to  a  peripheral  device  (i.e.,  output  buffer 
is  empty).   Two  jumper-selectable  interrupt  requests  can  be 
automatically  generated  by  the  USART  when  a  character  is 
ready  to  be  transferred  to  the  8086  CPU  (i.e.,  receive  channel 
buffer  is  full)  or  when  a  character  is  ready  to  be  transmitted 
(i.e.,  transmit  channel  data  buffer  is  empty).   A  jumper- 
selectable  interrupt  request  can  be  generated  by  two  of  the 
programmable  counters  and  eight  additional  interrupt  request 
lines  are  available  to  the  user  for  direct  interfaces  to 
user  designated  peripheral  devices  via  the  Multibus  interface. 
One  interrupt  request  line  may  be  jumper  routed  directly  from 
a  peripheral  via  the  parallel  I/O  driver/terminator  section 
and  one  power  fail  interrupt  may  be  input  via  auxiliary 
connector P2.

The iSBC 8612A board includes the resources for supporting
a variety of original equipment manufacturer system
requirements. For those applications requiring additional
processing capacity and the benefits of multiprocessing
(i.e., several CPUs and/or controllers logically sharing
system tasks with communication over the Multibus
interface), the iSBC 8612A board provides full bus
arbitration control logic. This control logic allows up to
three bus masters (e.g., combination of iSBC 8612A board,
DMA controller, diskette controller, etc.) to share the
Multibus interface in serial (daisy-chain) fashion or up to
16 bus masters to share the Multibus interface using an
external parallel priority resolving network.

The Multibus interface arbitration logic operates
synchronously with the bus clock, which is either derived
from the iSBC 8612A board or optionally generated by some
other bus master. Data, however, are transferred via a
handshake between the controlling master and the addressed
slave module. This arrangement allows different speed
controllers to share resources on the same bus, and
transfers via the bus proceed asynchronously. Thus, the
transfer speed is dependent on the transmitting and
receiving devices only. This design prevents slower master
modules from being handicapped in their attempts to gain
control of the bus, but does not restrict the speed at which
faster modules can transfer data via the same bus. The most
obvious applications for the master-slave capabilities of
the bus are multiprocessor configurations, high speed direct
memory access (DMA) operations, and high speed peripheral
control, but the bus is by no means limited to these three.

Adding the optional iSBC 300 Multimodule RAM to the iSBC
8612A board allows the on-board RAM to be expanded by 32K
(for an on-board total of 64K). If the optional iSBC 340
Multimodule EPROM is installed on the iSBC 8612A board, the
amount of on-board ROM/EPROM can be expanded by 16K (for an
on-board total of 32K).

b.  Special  Processing  Elements 

Special  purpose  processing  elements  will  also 
be  used  in  this  system  to  enhance  processing  capabilities. 
Typical  examples  are  array  processors,  FFT,  correlators,  etc. 
However,  they  have  not  been  included  in  this  thesis  project. 

c.  Memories 

Three  types  of  memories  are  provided. 

(1) Secondary Memory. It consists of two magnetic cartridge
hard discs and a dual drive floppy diskette system. The
magnetic hard disc is manufactured by the DYNEX Company and
has a storage capacity of 10 megabytes. This hard disc
system is connected to the system Multibus, which allows a
fast data transfer rate and gives it DMA capability. Its
interface to the Multibus is made by the Interphase Corp.
The dual floppy diskette drive is a part of the Intel
MDS-220 development system.

(2) Primary Memory. It consists of dynamic RAM and EPROM
(Erasable Programmable Read Only Memory). The EPROMs reside
in each SBC (8K bytes to 16K bytes per SBC). They can be
used as the monitor storage, and to store part of the
operating system. The RAMs reside in two types of physical
locations. The first location is on each SBC and has a
capacity of up to 64K bytes. The second type of location is
on separate RAM boards. A 128K byte RAM board developed by
the MUPRO Company is used. The RAM in the SBC is a dual
ported RAM which can be shared with other SBCs via the
Multibus interface. Part or all of the dual ported RAM can
be made accessible only to the on-board CPU; in other words,
made "private" and "unshared" to the SBC. The stand-alone
RAM boards are shared with other SBCs via the Multibus
interface.

d. Memory Hierarchy

The  primary  memory  of  this  type  is  partitioned 
according  to  the  following  hierarchical  scheme. 

A)  Private  Unshared  Memory  -  RAMs  available  on  each  SBC 
which  can  be  accessed  only  by  the  on-board  CPU. 

B) Internal Global Shared Memory - Internal global
shared RAM available on each SBC and special RAM
boards. The on-board RAM in the SBC is a dual
ported RAM and can be accessed by any SBC which
is a member of that cluster (inaccessible to PEs
in other clusters). See Section C.3.a.1.

C) External Global Shared Memory - External global
shared RAMs reside in special RAM boards and/or in
dual ported RAM of the SBCs. These memories can be
accessed by any SBC in the same "star," and any
SBC in the corresponding clusters in neighboring
stars.

Using this memory hierarchy, the total address space can be
expanded beyond the physical memory address space of each
CPU. The 8086 microprocessor has 20 address lines, so its
physical address space is $2^{20}$ = 1,048,576 bytes, or
1M bytes.

In this implementation, the total address space (memory
space) for a single star is partitioned in the following
way:

(1) Private Memory

6 µC in each cluster:
2 · 65,536 + 4 · (65,536 - 8,192) = 360,448 bytes/cluster
2 · 64K + 4 · (64K - 8K) = 352K bytes/cluster ³

8 µC in each cluster:
4 · 64K + 4 · (64K - 8K) = 480K bytes/cluster
= 491,520 bytes/cluster

(2) Internal Global Memory

6 µC/CL: 1M bytes · 3/4 = 768K bytes/cluster
= 786,432 bytes/cluster

8 µC/CL: 1M bytes · 3/4 = 768K bytes/cluster
= 786,432 bytes/cluster

(3) External Global Memory

6 µC/CL: 32K bytes/cluster = 32,768 bytes/cluster

8 µC/CL: 32K bytes/cluster = 32,768 bytes/cluster

As described before, a "star" consists of four clusters,
thus the total memory space for a single star is:

6 µC/CL: 4 · (352K + 768K + 32K)
= 4,608K bytes/star
= 4,718,592 bytes/star

8 µC/CL: 4 · (480K + 768K + 32K)
= 5,120K bytes/star
= 5,242,880 bytes/star

³ 1K bytes = 1024 bytes.

This expanded memory space can be determined in general as:

MS = Memory space

CL = Number of clusters in a "star"

PM = Private memory, in K bytes

GIM = Global internal memory, in K bytes

GEM = Global external memory, in K bytes

N = Number of SBCs

$$MS = CL \cdot \left( \sum_{i=1}^{N} PM_i + GIM + GEM \right) \qquad (3.0)$$

If all SBCs are assigned the same amount of private memory,
then (3.0) becomes

$$MS = CL \cdot (N \cdot PM + GIM + GEM) \qquad (3.1)$$


The memory space is computed for both 6 and 8 microcomputers
in a cluster mainly because of power supply considerations.
The available power supply can handle up to 6 SBCs in a
cluster. However, the controller for intercommunication is
designed for 8 SBCs.

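
The memory space figures quoted above can be checked with a
few lines of arithmetic. The sketch below evaluates Eq. (3.0)
for the 6 and 8 microcomputer configurations; the function
and variable names are illustrative, and the per-SBC private
memory lists are read off the private memory computation
given earlier.

```python
K = 1024  # 1K bytes = 1024 bytes


def memory_space_k(cl, private_k, gim_k, gem_k):
    """Eq. (3.0): MS = CL * (sum of PM_i + GIM + GEM), all sizes in K bytes."""
    return cl * (sum(private_k) + gim_k + gem_k)


# Private memory per SBC in K bytes, as in the text:
# 6 uC/cluster: 2 SBCs with 64K and 4 SBCs with 64K - 8K
# 8 uC/cluster: 4 SBCs with 64K and 4 SBCs with 64K - 8K
six_uc = [64] * 2 + [64 - 8] * 4
eight_uc = [64] * 4 + [64 - 8] * 4

GIM_K = 1024 * 3 // 4   # 768K internal global memory per cluster
GEM_K = 32              # 32K external global memory per cluster

star_6 = memory_space_k(4, six_uc, GIM_K, GEM_K)     # K bytes per star
star_8 = memory_space_k(4, eight_uc, GIM_K, GEM_K)   # K bytes per star
```

Multiplying the results by K converts them to bytes, which
allows a direct comparison with the byte totals in the text.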
4.   Intercommunication  Network 

In order to establish fast, reliable and highly fault
tolerant communication among SBCs of different clusters and
stars, three-level communication controllers were designed,
built and tested. They include a combination of random
priority, distributed, and central controllers as shown in
Fig. 3.4 for a single star. Each cluster has its own
distributed controller, so each star has four such
controllers. The four clusters share one central controller.
The four distributed controllers are identical, and have
some degree of programmability.

a.  Distributed  Controllers  (DC) 

A  block  diagram  of  the  distributed  controller  is 
depicted  in  Fig.  3.5.   It  resides  on  a  single  board  located 
in  each  cluster.   Its  primary  functions  are  the  following: 

1)  Arbitration  among  Internal/External  bus  requests 
from  within  and  outside  the  cluster. 

2)  Priority  resolving. 

3)  Inter-cluster  advance  activities  monitoring. 

4)  Interacting  with  the  central  controller. 

5)  Deadlock  avoidance. 

b.  Random  Priority  Controller  (RPC) 

The  RPC  is  a  bus  contention  resolver  based  on 
a  binary  tree  approach.   The  RPC  accepts  up  to  eight  "Bus 
Requests"  (BREQ)  and  issues  a  single  "Bus  Priority  In"  (BPRN) 
signal.   BREQ  is  a  signal  generated  by  the  bus  arbiter  which 
resides  on-board  the  SBC  to  indicate  that  this  particular  SBC 
requires  the  control  of  the  cluster  system  bus  (Multibus)  for 
one or more data transfers. BPRN is a signal generated by
the RPC to indicate to the requesting SBC that control of
the cluster bus is granted. Prior to issuance of a BPRN, the
RPC generates an "advance bus priority in" signal (BPRN*),
which is sent to the ICAAM (intra-cluster advance activities
monitor) as a "port selector" signal. This signal starts a
chain of logical activities which eventually causes the DAC
(deadlock avoidance circuit) to send two signals, BHD (bus
hold) and PRE (priority enable), to the RPC. When the
appropriate BHD and PRE are received by the RPC, it will
generate the BPRN signal. BHD is a positive logic signal
which enables the tristate output of the RPC, allowing BPRN*
to propagate and become a BPRN signal when the PRE signal is
enabled. If BHD goes low, it disables all BPRN*. PRE is a
negative logic signal which is generated in the DAC circuit.
When the PRE signal is generated, it disables requests from
other clusters and enables the output driver of the RPC to
send the BPRN.

The  RPC  has  an  internal  clock  to  synchronize  its 
arbitration  function.   More  details  can  be  found  in  Section 
C.4.b. 
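
The gating of BPRN* by BHD and PRE can be summarized as a
single logical condition. The following is a minimal sketch
that treats the RPC output stage as purely combinational; it
ignores the tristate drivers and all timing, and the function
name is illustrative.

```python
def bprn_granted(bprn_star: bool, bhd: bool, pre_n: bool) -> bool:
    """Return True when BPRN is issued to the requesting SBC.

    bprn_star: advance bus priority in (BPRN*) selected by the RPC
    bhd:       bus hold, positive logic (True = high)
    pre_n:     priority enable, negative logic (False = low = enabled)
    """
    return bprn_star and bhd and not pre_n
```

With BHD high and PRE low the selected request propagates;
with BHD low no BPRN* can pass, whatever the state of PRE.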

The ICAAM (Intra-Cluster Advance Activities Monitor) has a
multiplexer which selects two signals, MSBT (most
significant address bits, 5 bits out of 20) and ADRDC/ADWTC
(advance read command/advance write command), when a BPRN*
is received from the RPC. By analysing the MSBT, the ICAAM
generates a bus request of one of the following types:

1) Intra-cluster bus request. It is a request for the
system bus in the same cluster only. In response
to this request, the ICAAM generates an IREQ signal.

2) Inter-cluster bus request. It is one of four
cluster requests (CLREQ*) generated by the ICAAM of
the distributed controller. Each CLREQ* requests
three resources: the system bus of the requesting
cluster, the system bus of the requested cluster
and the interconnecting bus switch. Following a
CLREQ*, the ICAAM also creates an EXREQ for the CIC
(coincidence inhibiter circuit).

3) Inter-star bus request. This request, labeled
STREQ*, involves three resources: the system bus
of a cluster in the requesting star, the system
bus of the corresponding cluster in the requested
star, and the interconnecting bus switch between
these two stars. Following a STREQ* signal, the
ICAAM also creates an EXREQ for the CIC.
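
The role of the MSBT can be illustrated as follows. Taking
the 5 most significant bits of a 20-bit address divides the
1M byte space into 32 blocks of 32K bytes each. The decode
table in this sketch is purely hypothetical, since the
actual assignment of blocks to request types is not given
here; only the bit extraction follows from the text.

```python
ADDRESS_BITS = 20
MSBT_BITS = 5
BLOCK_SIZE = 1 << (ADDRESS_BITS - MSBT_BITS)  # bytes covered by one MSBT code


def msbt(address: int) -> int:
    """Return the 5 most significant bits of a 20-bit address."""
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address must fit in 20 bits")
    return address >> (ADDRESS_BITS - MSBT_BITS)


# HYPOTHETICAL decode table, for illustration only: the real block
# assignment used by the ICAAM is not specified in the text.
HYPOTHETICAL_DECODE = {code: "IREQ" for code in range(30)}
HYPOTHETICAL_DECODE[30] = "CLREQ*"
HYPOTHETICAL_DECODE[31] = "STREQ*"


def request_type(address: int, decode=HYPOTHETICAL_DECODE) -> str:
    """Classify a bus request by looking up the MSBT of its address."""
    return decode[msbt(address)]
```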

The ICAAM also generates an advance read command (ADRDC) or
advance write command (ADWTC) before the corresponding read
command (MRDC) or write command (MWTC) is generated by the
bus controller of the requesting SBC. This is done by
monitoring the activities of the CPU of the requesting SBC
before the CPU is granted the system bus. Those signals are
needed to determine the direction of the drivers in the bus
switch in advance, so that all switching transients are
settled before a data transfer takes place.

CIC (Coincidence Inhibiter Circuit) - The CIC accepts five
signals as inputs: one STPRN (star priority in), three CLPRN
(cluster priority in) from the central controller and one
IREQ/EXREQ from the ICAAM. It generates one output signal,
INH (inhibit), for the DAC (deadlock avoidance circuit). The
primary function of the CIC is to inhibit a BPRN from the
RPC when a CLREQ* or STREQ* has been issued by the ICAAM,
until either a CLPRN or a STPRN is granted by the central
controller to the CIC. The INH signal is needed to prevent
the system bus from being tied up while waiting for the
inter-cluster request to be granted; this allows efficient
bus usage and reduces bus contention.

DAC (Deadlock Avoidance Circuit) - A "deadlock" is a
situation in which two processes are unknowingly waiting for
resources that are held by each other and thus unavailable
[192]. More details can be found in Sections C.5.d and
C.5.e. The primary function of the DAC is to prevent
deadlock. Its principle is similar to the "Suspend" Lock
method [Ref. 193]. The DAC accepts four input signals, ANREQ
(any request), INH, STREQ and CLREQ, and generates three
signals: BHD (bus hold), PRE (priority enable) and CL/STPRN.
Three cases will be described to explain the operation of
the DAC, depending on the order in which the CLREQ (or
STREQ) and the INH signals occur.

(Case 1) - If a CLREQ (or STREQ) occurs prior to the INH
signal, the CL/STPRN signal will be granted. In this case,
BHD will go low and PRE high, thus freezing the selected
request in the RPC and disabling the BPRN*, which will
release all the resources held by the appropriate SBC via
the BPRN* signal (ICAAM, CCU-I). About 30 nsec later, a
CL/STPRN will be generated by the DAC. This allows the
appropriate processing element to be granted the system bus.

(Case 2) - If a CLREQ (or STREQ) signal occurs after the INH
signal, the CL/STPRN signal will be blocked. This indicates
that the system bus is in use. In this case, BHD is high and
PRE goes low, and BPRN will be granted.

(Case 3) - If the INH signal and the CLREQ (or STREQ) signal
occur simultaneously, within a time window of 15 nsec, the
CLREQ (or STREQ) signal will be blocked as before. If a
transient CL/STPRN signal occurs, the "GLITCH KILLER" will
suppress it and prevent the transient from propagating to
the central controller.

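
The three cases can be condensed into a small decision
function. This is a behavioral sketch only: it assumes the
arrival times of the two signals are known, treats the
15 nsec window as inclusive, and does not model the 30 nsec
delay or the glitch suppression. All names are illustrative.

```python
from typing import NamedTuple

COINCIDENCE_WINDOW_NS = 15  # window within which arrivals count as simultaneous


class DacOutputs(NamedTuple):
    bhd: bool        # bus hold, positive logic (True = high)
    pre_n: bool      # priority enable, negative logic (False = low)
    cl_st_prn: bool  # True when CL/STPRN is granted to the external request


def dac_decide(req_time_ns: float, inh_time_ns: float) -> DacOutputs:
    """Resolve a CLREQ/STREQ against INH according to Cases 1 to 3."""
    if abs(req_time_ns - inh_time_ns) <= COINCIDENCE_WINDOW_NS:
        # Case 3: coincident arrival, the external request is blocked
        return DacOutputs(bhd=True, pre_n=False, cl_st_prn=False)
    if req_time_ns < inh_time_ns:
        # Case 1: external request first, local BPRN* is disabled
        return DacOutputs(bhd=False, pre_n=True, cl_st_prn=True)
    # Case 2: INH first, the bus is in use and the local BPRN proceeds
    return DacOutputs(bhd=True, pre_n=False, cl_st_prn=False)
```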
c.   Central  Controller  (CC) 

The central controller is a single board controller, which
consists of two clocks and four identical units, each
corresponding to one cluster in the star. The primary
functions of the CC are:

1) To arbitrate among different CLREQ and STREQ to a
single cluster.

2) To enable and disable the CL/STPRN signal chain.

3) To enable and disable the appropriate bus switch links
of the complete star switch.

A  block  diagram  of  the  CC  is  presented  in  Fig.  3.6. 

CLK-1 - Clock 1 is the main clock of the central controller.
Its frequency is 30 MHz. It is used to synchronize and
enable the arbitration function of the CSRA (cluster/star
request arbiter) and the four-phase clock, CLK-2.

CLK-2 - Clock 2 is a four-phase, anti-coincidence clock. Its
input is CLK-1, from which it generates four clocks, one for
each of the four CSRAs. The functions of the four-phase
clock are:

1) To synchronize the CLREQ (or STREQ) chain action via
the CSRA in order to prevent deadlocks. The deadlock
avoidance method used in this implementation is similar
to the "spinning lock" method [192]. The spinning
lock rotates at a frequency of 3.75 MHz (30/8 MHz).

CSRA (Cluster/Star Request Arbiter) - The CSRA is a rotating
priority resolver. Its primary functions are:

1) To arbitrate among requests from three other clusters
within the same star and from the corresponding
cluster in the neighboring star.

2) To enable the selected request, after being
synchronized with the spinning lock, to propagate to the
requested cluster.

The CSRA accepts four different requests to a single cluster
and grants one of them according to a rotating priority
scheme.
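
A rotating priority scheme can be sketched as follows. The
exact rotation rule of the CSRA is not detailed here, so this
example assumes the common convention in which the input
just after the last granted one receives the highest
priority on the next arbitration. The class name is
illustrative.

```python
class RotatingPriorityArbiter:
    """Grant one of n request lines, rotating the highest priority."""

    def __init__(self, n_inputs: int = 4):
        self.n_inputs = n_inputs
        self.first = 0  # input that currently holds the highest priority

    def grant(self, requests):
        """Return the index of the granted request, or None if none is pending."""
        for step in range(self.n_inputs):
            index = (self.first + step) % self.n_inputs
            if requests[index]:
                # Assumed rule: the input after the winner goes first next time
                self.first = (index + 1) % self.n_inputs
                return index
        return None
```

Under this rule no requesting cluster can be starved, since
a granted input drops to the lowest priority.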

CSPE (Cluster/Star Priority In Enable) - The CSPE is a
demultiplexer whose primary function is to enable the
CL/STPRN chain action. The CSPE is synchronized by the CSRA.
When a CLPRN is received from the requested cluster, the
CSPE will enable the CLPRN chain action to the selected
requesting cluster.

SSEC (Star Switch Enable Circuit) - The SSEC consists of a
set of six drivers. It accepts the different CLPRNs and
generates the signals ECC, DIR and DIR-bar. ECC is a
negative logic signal which enables one of the bus switch
links corresponding to the CLPRN signal. DIR is a signal
which sets the requesting direction of the drivers in the
selected link of the "complete star" bus switch. DIR-bar is
the inverted DIR signal. The SSEC is responsible for the
enabling of the six different links of the complete star bus
switch as depicted in Fig. 3.7.

5.   Intercommunication  Procedures  Among  Resources 

Communication among the resources of this system is governed
by the following basic concepts: explicitly segmented
memory; an unshared local and shared global
internal/external memory hierarchy; an asynchronous process
structure; and a design decision that each single board
computer is allowed to use the system bus for transfer of
only one word of data and then must release the system bus
to other SBCs, except when a prefix lock is executed by
software. A software lock will grant the bus to that SBC for
any length of time needed by that SBC. In general, this
feature is not required frequently, so the operating system
will not normally be delayed waiting for the system bus to
be released in order to test a semaphore or any other
synchronization primitive.

In order to provide effective communication among all
processing elements (within a single cluster, among
different clusters in a single "star," and among "stars")
and to arbitrate the contention of bus usage (in the star
bus switch and inter-star bus switches), we have developed
an intercommunications system managed by distributed and
central controllers, as described in Chapter III.D.4 and
III.D.5.

In order to describe the communication protocol among
different SBCs, a two "star" system is chosen (STAR-1,
STAR-2), as depicted in Fig. 3.8. Several examples of
different types of communication are presented.

a.   Example  #1  -  Intra-Cluster  Communication 

Intra-cluster  communication  is  accomplished  by 
means  of  data  transfer  via  the  cluster  Multibus.   This  type 
of  communication  does  not  involve  the  central  controller  or 
any  bus  switch.   The  distributed  controller  resident  in  the 
specific  cluster  and  on-board  SBCs  are  the  controllers  of 
this  communication  link. 

For example, let us assume SBC-1 in cluster A1 requests some
information from SBC-2 in the same cluster. The sequence of
events (Fig. 3.9) is:

a) SBC-1 generates a BREQ signal.

b) The RPC of the distributed controller grants the
request and generates a BPRN* signal.

c) The ICAAM of the distributed controller generates
an IREQ signal for the inhibiter.

d) From the IREQ, the "CIC" generates an inhibit
signal which causes the DAC to send appropriate
BHD and PRE signals.

e) These two signals are sent to the RPC to close the
chain, and a BPRN is generated.

f) The BPRN signal is applied to the arbiter circuit
of the corresponding SBC. From this point, a
regular Multibus transfer is executed.

These six events are necessary to establish any
intra-cluster communication, but they are not sufficient.
The following conditions, corresponding to requests from
other clusters and stars, must be examined:



1) Is there any other cluster in process of communication
with this cluster?

2) Is there any other star in process of communication
with this cluster?

For simplicity of this example, we assumed that no external
requests were involved in the process of intra-cluster
communication.

Upon termination of the data transfer via the system bus,
SBC-1 releases its BREQ signal, which releases all resources
held by SBC-1. The average time of a word transfer is
1.65 µsec.

b. Example #2 - Inter-Cluster Communication
(within a Star)

Inter-cluster communication is accomplished by means of data
transfer via two clusters' system buses (Multibus) and the
bus switch interconnecting those two clusters. This type of
communication involves all controllers, the star bus switch,
and the on-board SBC arbiter. (See Fig. 3.10).

Assume that SBC-1 in cluster A1 requests some information
from SBC-1 in cluster B1. The sequence of events is:

1) SBC-1 of A1 generates a BREQ signal.

2) The RPC of the distributed controller in cluster A1
locks on the request and generates a BPRN* signal.

3) The BPRN* signal is applied to the ICAAM of the
distributed controller.

4) The ICAAM generates two signals: CLREQ-B1, which
propagates to the rotating priority arbiter of the
central controller unit B, and "EXREQ," which is
applied to the "CIC" coincidence inhibiter of the
distributed controller of cluster A1.

5) The "CIC" coincidence inhibiter generates an
appropriate INH signal which will cause the distributed
controller in cluster A1 to wait for a CLPRN from
the demultiplexer of the central controller, unit B.

6) The "cluster/star request arbiter" in the central
controller locks on the CLREQ-B1 signal and waits
for the spinning lock to enable the CLREQ chain
action.

7) The CLREQ signal is applied to the DAC of the
distributed controller of cluster B1.

8) The DAC of the distributed controller of cluster B1
generates a CLPRN signal which is applied to the
demultiplexer of unit B of the central controller.

9) The central controller enables the CLPRN signal to
the "DAC" of the distributed controller in cluster
A1, which generates appropriate BHD and PRE signals.

10) The BHD and PRE signals are applied to the RPC and
close the chain action. The RPC then generates
the BPRN signal.

11) The BPRN signal is applied to the on-board SBC-1
arbiter, which starts the regular Multibus
communication.

12) After event #9, a parallel process is initialized.
This process is the bus switch enable. Two signals,
DIR and ECC, are sent to the bus switch which links
the buses of cluster A1 and cluster B1.

13) Those two signals prepare the switch for the coming
data transfer.

The initialization of the bus switch terminates 200 nsec
before the transfer of data via the bus (switch). This
feature makes the bus switch transparent to the requesting
cluster, and both clusters are linked on a longer system bus
for the time the transfer takes place. SBC-1 in cluster A1
can use the "longer" system bus (two system buses and the
bus switch) for more than one word transfer, if this feature
is requested by a software bus lock instruction from SBC-1.
Termination of this process is started by the release of the
BREQ signal by SBC-1 of cluster A1. This event releases all
resources held by SBC-1 of cluster A1.

The sequence of events described in this example is
necessary for this type of communication. Other external
events were not introduced in order to simplify the example.
This sequence of events takes place in an average time of
2.1 µsec.

c.   Example  #3  -  Inter-Star  Communication 

Inter-star  communication  is  accomplished  by  means 
of  data  transfer  via  the  system  buses  of  two  clusters  and  the 
bus  switch  interconnecting  these  two  clusters.   This  type  of 
communication  involves  all  controllers,  and  the  bus  switch 
interconnecting  the  two  clusters.   The  sequence  of  events  is 
similar  to  the  previous  example.   Instead  of  the  CLREQ  signal, 
a  STREQ  signal  is  applied  to  the  central  controller.   The 
responding  signal  is  STPRN.   (See  Fig.  3.11). 

Examples 1, 2, and 3 described a case of separable
communication levels. In a real application, the situation
can be more complicated. For example, a simultaneous
combination of the three different examples is possible. In
such a case, deadlocks could occur frequently [193]. In
order to prevent those deadlocks, two methods of deadlock
avoidance are used.

"Suspend Lock" - This method is implemented in the DAC of
the distributed controller. In order to explain how this
method works, the following example is used.

d.   Example  #4  -  Deadlock  Avoidance  I  - 
Suspend  Lock 

SBC-i in cluster A1 of star 1 requests SBC-j in cluster A2
of star 2 (process P1), and SBC-k in cluster A2 of star 2
requests SBC-ℓ in cluster A1 of star 1 (process P2),
{8 ≥ i, j, k, ℓ ≥ 1}. Let us assume that at time T the two
request processes P1 and P2 have progressed to state No. 3
(Fig. 3.12). At this point of execution, the processes P1
and P2 are holding the following resources:

P1: {RPC-DC-A1, ICAAM-DC-A1, CSRA/CCB1, DAC-A1, CIC-A1}

P2: {RPC-DC-A2, ICAAM-DC-A2, CSRA/CCA2, DAC-A2, CIC-A2}

At this point of execution, each process requests the DAC
located in the other distributed controller. But the two
DACs are held by the requesting processes and are
unavailable. It seems that we have a deadly embrace
situation (deadlock).

The DAC is designed to avoid such a case. One of the DACs
(called the first DAC, according to the time of arrival of
the requests) will suspend the lock of the second DAC by
releasing some of the resources that are held by the second
requesting process. This way the first requesting process
will advance while the second is suspended and waits for the
first process to terminate. Without the suspend lock method,
this deadlock could happen when the two requesting clusters
are located in different stars, because the spinning locks
of the two central controllers are not synchronized.
Therefore, the spinning lock function is limited for
inter-star communication. This is the reason for having two
types of deadlock avoidance methods. The suspend lock method
is used to prevent deadlock in inter-star communication.
Synchronizing the spinning locks of the different central
controllers of a multi-star system is not desirable for
fault tolerance, and sometimes it may not be possible.
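
The deadly embrace of Example #4 and its resolution can be
modeled with a wait-for relation between processes. The
sketch below is a software analogy of the suspend lock, not a
description of the DAC hardware: it assumes each process
waits for a single resource, it suspends the later-arriving
process of a cycle, and for simplicity it releases all of
that process's resources rather than only some of them. The
function names are illustrative.

```python
def find_cycle(waits_for):
    """Return the processes forming a wait-for cycle, or None if there is none."""
    for start in waits_for:
        path = [start]
        node = waits_for.get(start)
        while node is not None and node not in path:
            path.append(node)
            node = waits_for.get(node)
        if node is not None:
            return path[path.index(node):]
    return None


def suspend_lock(holds, wants, arrival):
    """Suspend later-arriving processes until no wait-for cycle remains."""
    holds = {proc: set(res) for proc, res in holds.items()}
    suspended = []
    while True:
        owner = {r: proc for proc, res in holds.items() for r in res}
        waits_for = {proc: owner[r] for proc, r in wants.items()
                     if proc not in suspended and r in owner and owner[r] != proc}
        cycle = find_cycle(waits_for)
        if cycle is None:
            return holds, suspended
        victim = max(cycle, key=lambda proc: arrival[proc])
        holds[victim].clear()  # simplification: release everything it holds
        suspended.append(victim)
```

Applied to P1 and P2 above, with P1 arriving first, P2 is
suspended and P1 can then obtain DAC-A2.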

The second method of deadlock avoidance is the "spinning
lock" method. This method is used to prevent deadlocks which
may occur in inter-cluster or intra-cluster communication
within the same star. If for any reason this method fails to
prevent a deadlock, the "suspend lock" method will take over
and prevent the deadlock. The reason for using two different
methods is to reduce the overhead created by the suspend
method and to increase fault tolerance.

CLK-2 in the central controller is a four-phase
anti-coincidence clock as shown in Fig. 3.22. This clock is
the "spinning lock" generator.

e. Example #5 - Deadlock Avoidance II -
Spinning Lock (Fig. 3.12)

Let us assume that SBC-i in cluster A requests SBC-j in
cluster B and SBC-k in cluster B requests SBC-ℓ in cluster
A. These requests are all for SBCs residing in the same
"star." If the two requests are sent simultaneously to the
CSRA of CCA and the CSRA of CCB, respectively, of the
central controller, they eventually will progress to the
deadlock condition as explained in Example #4. In order to
prevent such a possibility, the CSRA of the central
controller is designed with two "lock in request" phases.

1)  The  first  phase  is  implemented  by  the  rotating 
priority  arbiter. 

2)  The  request  selected  by  the  first  arbiter  propagates 
to  the  "spinning  lock"  circuit  which  will  lock  on 
the  request  only  when  CLK-2  goes  low. 

CLK-2 has four phases. Since only one phase goes low at any
given time, it is impossible for both requests to leave the
central controller at the same time for the distributed
controller of the requested cluster. This eliminates the race
condition and the deadlock. A race condition occurs when the
scheduling of two processes is so critical that the various
orders of scheduling them result in different processing [192].
The minimum time difference caused by the spinning lock to the
requesting process is equal to the anti-coincidence time t_ac
of CLK-2 (Fig. 3.22).
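
The two-phase lock-in described above can be illustrated with a minimal Python sketch. It is not the circuit itself: time is modeled in integer ticks, each section of the central controller is assumed to own one CLK-2 phase, and the names `clk2_low` and `lock_tick` are hypothetical. The anti-coincidence gap t_ac between phases is not modeled; the sketch only shows that two requests arriving together cannot be latched at the same instant.

```python
# Hypothetical model of the four-phase anti-coincidence clock CLK-2.
# Assumption: phase p is low only on ticks where tick % 4 == p,
# so at most one phase is low at any tick.
NUM_PHASES = 4


def clk2_low(phase: int, tick: int) -> bool:
    """Return True when the given CLK-2 phase is low at this tick."""
    return tick % NUM_PHASES == phase


def lock_tick(phase: int, request_tick: int) -> int:
    """First tick at or after request_tick at which the spinning lock
    driven by this phase can latch its request."""
    tick = request_tick
    while not clk2_low(phase, tick):
        tick += 1
    return tick


# Two requests arrive simultaneously at section A (phase 0)
# and section B (phase 1) of the central controller.
lock_a = lock_tick(phase=0, request_tick=5)
lock_b = lock_tick(phase=1, request_tick=5)
print(lock_a, lock_b)  # the two requests are latched at different ticks
```

Because the phases never coincide, the two requests leave the central controller one after the other, which is the property the spinning lock relies on.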

6.   Multibus  Communication 

Two  arbitration  circuits  are  used  in  the  Multibus 
communication:   the  on-board  SBC  arbiter  called  Bus  Arbiter 
and  the  RPC  of  the  distributed  controller. 

The Bus Arbiter provides several resolving techniques
based on a priority concept that at a given time one SBC will
have priority above all the rest. The RPC can be regarded as
a parallel priority resolver. A parallel priority resolving
technique has a separate bus request (BREQ) line for each
arbiter on the system bus (Multibus). Several BREQ lines enter
the RPC input. For each BREQ line, there is a corresponding
BPRN (bus priority in) line at the output of the RPC.
Only one BPRN signal can be active at any given time.
This BPRN signal is returned to the highest priority requesting
bus arbiter. The bus arbiter receiving priority (BPRN
active low) then allows its associated SBC onto the
multi-master system bus as soon as the bus becomes available
(i.e., it is no longer busy). When one bus arbiter gains
priority over another arbiter, it cannot immediately seize the
bus. It must wait until the present bus occupant completes its
transfer cycle. Upon completing its transfer cycle, the
present bus occupant recognizes that it no longer has priority
(BPRN goes high) and surrenders the bus, releasing the Busy
signal. Busy is an "active low" signal line which goes to
every bus arbiter on the system bus and is tied with the other
Busy signals by an "OR" gate. When Busy goes high, the
arbiter which presently has bus priority (BPRN active low)
seizes the bus and pulls Busy low to keep other arbiters
off the bus. (See waveform timing diagram, Fig. 3.13.)
Note that all multi-master system bus transactions are
synchronized to the bus clock (BCLK). This gives the parallel
priority resolving circuit time to settle and make a correct
decision. Fig. 3.14 depicts the interconnections between the
bus arbiters and the RPC.
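
The BREQ/BPRN/Busy handshake can be summarized by a small Python sketch. It is illustrative only: signals are modeled as booleans meaning "asserted" (the real lines are active low), and the winner is picked in a fixed order with the lowest index first, which is an assumption made for brevity and not the ordering rule of the actual RPC. The function names are hypothetical.

```python
from typing import List


def resolve_bprn(breq: List[bool]) -> List[bool]:
    """Parallel priority resolver: one BPRN output per BREQ input,
    with at most one BPRN asserted at a time.
    Assumption: fixed priority, lowest index wins."""
    bprn = [False] * len(breq)
    for index, requesting in enumerate(breq):
        if requesting:
            bprn[index] = True
            break
    return bprn


def may_seize_bus(bprn_asserted: bool, busy_asserted: bool) -> bool:
    """An arbiter takes the bus only when it holds priority AND the
    present occupant has released Busy."""
    return bprn_asserted and not busy_asserted


# Arbiters 1 and 2 request the bus while another master still holds it.
grants = resolve_bprn([False, True, True])
print(grants)                          # only arbiter 1 receives BPRN
print(may_seize_bus(grants[1], True))  # must wait: Busy still asserted
print(may_seize_bus(grants[1], False)) # bus released: arbiter 1 seizes it
```

Separating the grant (BPRN) from the actual seizure (gated by Busy) is what lets the current transfer cycle complete before the bus changes hands.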

Fig. 3.13  Timing Diagram of Bus Arbiter and Random Priority
           Controller

In our configuration, every master currently using
the bus will surrender the bus upon completing its transfer
cycle (unless a bus lock is executed). This property is
accomplished by tying the CBREQ (common bus request) lines
of all bus arbiters to ground. CBREQ is an active low signal
which indicates to the current master on the bus that the bus
has been requested by another master.

Two other signals, LOCK and CRQLCK, add to the
flexibility of the bus arbiter within the system configuration.
LOCK is a signal generated by the processor to prevent the
bus arbiter from surrendering the multi-master system bus to
any other master, of either higher or lower priority. CRQLCK
(common request lock) serves to prevent the bus arbiter from
surrendering the bus to a lower priority bus master when
conditions warrant it. LOCK is used for implementing software
semaphores for critical code sections and for real time
critical events (such as memory refresh or hard disc transfer).

In describing the three different types of communication we
referred to the PRN and REQ chains. The following state
diagrams depict those chains:

1)   Intra-cluster communications
     [State diagram; signals shown: BREQ, BPRN]

2)   Inter-cluster communications
     [State diagram; signals shown: CLREQ, BPRN, CLPRN*]

3)   Inter-star communication
     [State diagram]

D.   PRESENTATION OF RESULTS

1.   Introduction 

The important hardware components developed in this
thesis to support this multiple microcomputer system are the
following:

     Interconnection:
          Intra-cluster -- Multibus
          Inter-cluster -- Complete-Star Bus Switch Network

     Intercommunication Control (three levels):
          Random-Priority Controller
          Distributed Controller
          Central Controller

In this section, we will present representative test
results to answer two major questions.

1)  Did our design work?

2)  How well did it work?

Since the Multibus was developed by Intel and is well
documented [196], we decided not to report its operations here.
We will describe the operational results of the bus switch
and the three levels of intercommunication control.

How well they work together in a computational
environment will be reported in Chapter IV, where the
implementation of an adaptive spatial filter on the multiple
microcomputer system will be described.

2.   Bus  Switches 

The function of a bus switch is to transmit a signal
from the Multibus in one cluster to the Multibus in another
cluster. For four clusters, the "complete star bus switch
network" designed has six branches of bus switches as shown
in Fig. 3.7. Although Intel's Multibus has 86 lines,
we decided that only 58 of them need to be switched to
facilitate communication between two SBCs from different
clusters. Therefore, one "bus switch" includes appropriate
circuits to transmit 58 signals, including data, address and
control signals.

Four figures will be used to describe the behavior
of the bus switch. The first three figures show the
improvement of the signal waveform from before to after the
bus switch. The signals shown are the following:

     One data bit        -  Fig. 3.15a
     One address bit     -  Fig. 3.15b
     One control signal  -  Fig. 3.15c

Each figure consists of two traces. The top trace shows the
waveform before the switch. The lower trace shows the
waveform after the switch. It can be seen that in all three
cases the waveforms after the switch are better because their
rise times are all shorter, giving a sharper pulse. It is
interesting to note the noise appearing on these three signals.
Such noise is typical in the real operational environment. It
should be noted that the control signal in Fig. 3.15c is the
Acknowledge Signal (XACK) generated by the SBC requesting the
use of the system bus.

The behavior of the bus switch is described also by
Fig. 3.20, which shows the delay of the switch. Again, the
top trace is before the switch and the bottom trace is after
the switch. The delay is no more than 25 nsec.

These four figures demonstrate that our bus switches
are adequate to provide communication between two Multibuses
running at 10 MHz.
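
To put the measured delay in proportion, the short Python calculation below compares it with one bus clock period. It uses only the two figures quoted above (25 nsec and 10 MHz); the variable names are illustrative, and taking a full clock period as the reference interval is an assumption, not a timing specification of the Multibus.

```python
# Figures quoted in the text above.
BUS_CLOCK_HZ = 10e6        # Multibus clock rate
SWITCH_DELAY_NS = 25.0     # worst-case delay measured through one bus switch

# One clock period in nanoseconds, and the share of it used by the switch.
period_ns = 1e9 / BUS_CLOCK_HZ
delay_fraction = SWITCH_DELAY_NS / period_ns

print(f"clock period   : {period_ns:.0f} nsec")
print(f"switch delay   : {SWITCH_DELAY_NS:.0f} nsec")
print(f"share of period: {delay_fraction:.0%}")
```

Under this assumption the switch consumes about a quarter of one clock period, leaving the remainder for the signals to settle on the destination Multibus.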

Figure 3.15  The input and output waveforms of three
             selected signals to demonstrate the
             performance of the bus switch

             Fig. 3.15a  A data bit
             Fig. 3.15b  An address bit
             Fig. 3.15c  A control signal, "Acknowledge" (XACK)

             Top trace:    Input to the bus switch
             Bottom trace: Output of the bus switch

3.   Random Priority Controllers (RPC)

The function of the random priority controllers is to
arbitrate the requests for bus usage from many SBCs, either
from the same cluster or from several clusters. If an SBC
from another cluster wants the Multibus to communicate either
with another SBC or with the Global RAM, the higher level
controllers - the central controller and the two distributed
controllers associated with this cluster and with the other
cluster where the requesting SBC resides - must also
participate in the control function. However, the control
ultimately comes to the RPC because it is the circuit which
grants the bus usage signal, BPRN (Bus Priority In). One RPC
is used for every Multibus, so there are four RPCs in each star.

The behavior of our RPC will be described by four
figures using the BPRN signals (Bus Priority In) of the SBCs
requesting the bus. A BPRN low signal means the SBC has been
granted the bus and is using it.

a.  Sharing  of  the  Multibus  by  Two  SBCs. 

Fig. 3.16 shows the BPRNs of two SBCs. The bus usage
pattern was created by software. Each unit of low BPRN
represents a transfer of one word. If there is no request for
bus usage by other SBCs, the SBC currently using the bus will
hold it, as shown by the BPRN signal staying low for a longer
period of time. The figure shows the interleaving of bus usage
by these two SBCs, indicating that the RPC works rapidly and
efficiently to serve these two SBCs.

b.  Slow-Down  of  Bus  Release  Due  to  Refresh 
of  Dynamic  RAM 

However, we discovered that the SBC using the
bus may not release the bus after its one-word transfer,
as shown by the wide gap in Fig. 3.17, although the other SBC
was requesting the bus. We found that this is in the nature
of Intel's 8612 design: when the dynamic RAM is being
refreshed, the SBC will not release the bus. This is a
drawback we cannot do anything about except to redesign the
8612 SBC.

Figure 3.16  Bus Priority In signals of two SBCs to demonstrate
             the arbitration of their usage of the bus by the
             random priority controller
             Traces: BPRN of SBC1, BPRN of SBC2

Figure 3.17  Bus Priority In signals of two SBCs to demonstrate
             the effect of dynamic RAM refresh on the bus usage
             Traces: BPRN of SBC1, BPRN of SBC2

Figure 3.18  Bus Priority In signals of four SBCs to demonstrate
             the arbitration of their usage of the bus by the
             random priority controller

c.  Sharing  of  Multibus  by  Four  SBCs 

Fig. 3.21 shows the BPRN signals of four SBCs.
Their general patterns are similar, in the sense that there
is no large gap in any one of these traces, indicating that no
SBC is dominating the bus and none is being left out either.
This "uniform" and "equal" treatment of all SBCs requesting
the bus is exactly what the RPC is designed to do.

d.  Behavior  of  RPC  When  the  Bus  is  Saturated 

We prepared the most severe test for the RPC by
programming four SBCs to request the bus all the time. Of
course, in real applications, this condition should never be
allowed to happen. It represents very poor application
programming. However, it is a tough test for the RPC. Fig. 3.19
shows the BPRN of the four SBCs. The interleaving of bus usage
is no different from the previous three figures. However,
it is important to note that the bus was first shared by SBC1
and SBC3 for 12 transfers and then shared by SBC2 and SBC4
for another 12 transfers, followed by the repetition of this
pattern. Two important properties caused this pattern.
First, the RPC is designed based on a binary tree selection.
Therefore, only two SBCs will be granted first, followed by
another pair. Second, the 12 transfers between SBC1 and SBC3
are determined by the basic design of the 8086 instruction
queue, which is a FIFO queue of six instructions.

Figure 3.19  Bus Priority In signals of four SBCs which
             request the bus usage 100% of the time to
             demonstrate the function of random priority
             controller

Figure 3.20  Waveforms of input and output signal of a
             bus switch to demonstrate the operation
             of the switch
             Traces: Input signal to a bus switch,
                     Output signal waveform from a bus switch

Figure 3.21  Bus Priority In signal of four microcomputers
             requesting 20% usage of the Multibus to
             demonstrate the operation of the random
             priority controller in this example of
             heavy bus requests (80% bus request)

This  demonstration  clearly  indicated  that  our 
RPC  is  able  to  arbitrate  four  SBCs  under  the  most  demanding 
bus  contention  situation  which  should  never  be  allowed  to 
occur  in  real  application. 
4.   Central  Controller 

The  function  of  the  central  controller  is  to  arbitrate 
requests  for  inter-cluster  and  inter-star  communication.   It 
works  jointly  with  the  distributed  controllers  to  search, 
select  and  synchronize  these  requests.   Although  there  is  only 
one  central  controller  for  a  star,  it  has  four  sections,  one 
for  each  cluster  in  the  star. 

The important components of each section in the
central controller are CSRA and CSPE. All four sections are
synchronized by two clocks: CLK1 for the searching and
selecting of requests, CLK2 for their synchronization.

Two figures will be used to demonstrate their
operations.

a.   Searching/Selecting  Clock  (CLK1)  and 
Synchronization  Clock  (CLK2) 

These two clocks are the heart beats of the
intercommunication network. It should be realized that CLK2 is
not independent because it is generated from CLK1. Fig. 3.22
shows their mutual relationship. The third trace is CLK1.
Below it are the four-phase CLK2 signals for the four clusters.
It is important to note that there is no overlap among them.
This is to avoid any undesirable coincidence. CLK1 is at a
higher clock frequency so that all requests from other
clusters and stars are searched and selected at adequate
rates. Once a request is selected, it is synchronized by
CLK2 and sent on to the appropriate cluster.

b.   Searching  and  Selection  of  Requests 

Fig. 3.23 shows the functions of the CSRA and CSPE
circuits of central controller section A. Four signals are
shown in the top half of the figure, representing the three
cluster requests from clusters B, C and D and the request from
cluster A of another star, respectively. The lower half of
this figure shows the cluster or star grant signals to the
other star and to clusters D, C and B, respectively. It is
important to note that these CLPRN (or STPRN) signals do not
overlap although the request signals do overlap. It can be
seen that cluster C sent its CLREQ first and got its CLPRN.
However, cluster D sent its CLREQ before cluster C finished
its request. Such an occasion is generally not allowed in real
applications because an SBC is allowed to transfer only one
word of data and must then release the bus unless a software
bus lock is ordered. However, this test is meant to challenge
the ability of the central controller. In this case, the
CSRA/CSPE of the CCA will allow cluster C to complete its
request period and then award a CLPRN to cluster D. This
figure clearly demonstrates that with a mix of cluster request
signals from three clusters and one star, some with overlap,
some without overlap, the central controller is able to take
in these requests, sort them out, select one at a time and
award the "cluster grant" appropriately.
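
The "one grant at a time" behavior can be mimicked with a small Python model. This is only a sketch under stated assumptions: time is divided into ticks, a grant is held until its owner drops its request, and the next owner is then chosen by scanning the sources in a rotating order. The rotating order, the source labels and the function name `grant_sequence` are illustrative choices, not a description of the CSRA/CSPE logic.

```python
from typing import List, Optional, Set

# Request sources seen by section A of the central controller
# (labels are illustrative).
SOURCES = ["B", "C", "D", "star"]


def grant_sequence(requests: List[Set[str]]) -> List[Optional[str]]:
    """For each tick, return the single source holding the grant (or None).
    Assumptions: the holder keeps the grant while its request stays active;
    a new holder is then picked in rotating order after the last winner."""
    holder: Optional[str] = None
    last_index = -1
    grants: List[Optional[str]] = []
    for active in requests:
        if holder is not None and holder not in active:
            holder = None                      # request finished, free grant
        if holder is None and active:
            for step in range(1, len(SOURCES) + 1):
                index = (last_index + step) % len(SOURCES)
                if SOURCES[index] in active:
                    holder, last_index = SOURCES[index], index
                    break
        grants.append(holder)
    return grants


# Cluster C requests first; cluster D requests before C has finished.
timeline = [{"C"}, {"C", "D"}, {"C", "D"}, {"D"}, {"D"}, set()]
print(grant_sequence(timeline))
```

In this timeline the overlapping request from cluster D is simply held back until cluster C has finished, so the grant outputs never overlap even though the request inputs do.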

Figure 3.22  Two Clocks in Central Controller for
             Searching/Selection and Synchronization
             of Requests from Stars and Clusters
             Traces: CLK1 (for searching and selection),
                     CLK2 (4 phase clock for synchronization)

Figure 3.23  Demonstration of the Functions of CSRA and
             CSPE Circuits in the Central Controller
             (Section A for Cluster A)
             Input to CSRA, Output from CSPE
             Four request signals to CSRA:
                  From DCB, From DCC, From DCD, From Star A
             Four Priority In signals from CSPE:
                  To Star A, To DCD, To DCC, To DCB

Of course, this is not the completion of the
intercommunication task. The CLPRN will be sent to the
distributed controller to initiate further control actions to
complete the total task of communication between two SBCs.

5.   Distributed  Controller 

The function of the distributed controller is the
same as that of the central controller. Both must work with
the RPC to complete the intercommunication. The central
controller is located away from the Multibus and also controls
the operations of all bus switches. The distributed
controller is mounted on the Multibus. Therefore, we have four
distributed controllers in a star. The important components
of each distributed controller are:

     ICAAM  (Intra-cluster advanced activities monitor)
     CIC    (Coincidence inhibitor circuit)
     DAC    (Deadlock avoidance circuit)

Four figures will be used to demonstrate their operations.
Eight control signals in the distributed controller are used
in these figures:

     1.  BREQ
     2.  CLREQ*
     3.  Internal/External Signal
     4.  Inhibit
     5.  PRE
     6.  BHD
     7.  CLPRN
     8.  BPRN

The first and eighth control signals, BREQ and BPRN,
are two of the most important ones because they are directly
connected to the SBCs. We must remember that all the buses,
switches and controllers are supporting circuits that help the
SBCs to compute and to talk among themselves efficiently. The
SBCs are the originators and receivers of the data and of the
communication and control signals.

a.  Intra-Cluster  Communication 

Fig.  3.24  shows  the  sequence  of  events  in  a  test 
case  where  one  SBC  in  a  cluster  wants  to  talk  to  another  SBC 
in  the  same  cluster. 

It can be seen that CLREQ* (second trace) is high,
which means no request from another cluster. CLPRN (7th
trace) is therefore also high, i.e., no cluster priority
signal is granted by the central controller.

It  is  interesting  to  notice  the  small  delays 
between  BREQ,  PRE  and  BPRN. 

b.  Inter-Cluster/Intra-Star  Communication 

Fig.  3.25  shows  the  sequence  of  events  in  a  test 
case  where  an  SBC  in  one  cluster  wants  to  talk  to  an  SBC  in 
another  cluster  within  the  same  star. 

There are several interesting points when this
case is compared with the intra-cluster case:

     o  Both BREQ and CLREQ* exist.

     o  The Inhibit signal is active to prevent any premature
        generation of BPRN.

     o  CLPRN is also active to respond to the CLREQ*.

It is clearly seen that this inter-cluster
communication has been correctly handled by the distributed
controller.

Figure 3.24  Eight Control Signals to Demonstrate
             the Function of Distributed Controller
             for Arbitration of Intra-Star and
             Intra-Cluster Communication
             Traces, in order: BREQ, CLREQ, INT/EXT, INH,
                               PRE, BHD, CLPRN, BPRN

Figure 3.25  Eight Control Signals to Demonstrate
             the Function of Distributed Controller
             for Arbitration of Intra-Star and
             Inter-Cluster Communication
             Traces, in order: BREQ, CLREQ, INT/EXT, INH,
                               PRE, BHD, CLPRN, BPRN

Figure 3.26  Eight Control Signals to Demonstrate
             the Function of Distributed Controller
             for Arbitration of Inter-Star Communication
             Traces, in order: BREQ, STREQ*, INT/EXT, INH,
                               PRE, BHD, STPRN, BPRN


c.   Inter-Star  Communication 

Figure 3.26 shows the sequence of events in a
test case where an SBC in one cluster of a star wants to talk
to an SBC in the corresponding cluster of a neighboring star.
They are quite similar to the inter-cluster/intra-star case
in Fig. 3.25, with several changes.

The second trace is now the STREQ* instead of the
CLREQ* signal.

The seventh trace is now the STPRN signal instead
of the CLPRN signal.

The rest of the signals behave quite similarly. This shows
that requests from a cluster in the same star and from a
neighboring star are treated quite the same.

IV.   IMPLEMENTATION OF ADAPTIVE FILTER
ON  MULTIPLE  MICROCOMPUTER  SYSTEM 


A.   INTRODUCTION 

1.   Selection  of  Microcomputer 

The goal of this thesis research was to eliminate
the gap between the theoretical development of image
processing algorithms and the experimental development of their
implementation on some processor systems which are good
candidates for practical applications.

In this thesis, a multiple microcomputer system was
chosen as the processor system candidate.

It should be recognized that only during the past
two to three years have 16 bit microcomputers been seriously
considered for signal processing implementations. Although
8 bit microcomputers have been investigated for performing
signal processing operations, the motivation of these studies
was mainly to explore what the 8 bit microcomputers can
do for signal processing. For serious implementations, bit
slice microprocessors have always been the favored approach,
since they can be designed to emulate 16 bit, 32 bit or even
longer word computers. However, 16 bit microcomputers are
being supported with more and more powerful hardware and
software and are approaching low-end minicomputer performance.

To examine the signal processing performance of
today's 16 bit MOS microcomputer, we coded the statistical
3x3 spatial filter on one main frame computer, the IBM 360/67,
and two 16 bit microcomputers, the DEC LSI-11 and the Intel
8612, using high order programming languages and a single
precision numerical data format. Fortran is used for the IBM
and DEC computers. PLM86 is used for the Intel computer. The
execution times expressed in seconds are shown in Table IV.1
for comparison.

                          TABLE IV.1

                IMAGE PROCESSING EXECUTION TIME
                         (in seconds)

Image Processing     IBM 360/67  DEC LSI-11  Intel 8612  Intel 8612
Operations           Fortran     Fortran     PLM86       Macro
                     Single      Single      Single      Integer
                     Precision   Precision   Precision

Spatial Statistics     4.07       25.46      334.25        0.72
Calculation

Spatial Filter         0.0047      0.24        2.82         --
Design

Perform Spatial        0.98        5.62       79.8         0.47
Filter

It can be seen that the LSI-11 has better floating point
computation support today than Intel's 8612, which took 13 to
14 times longer than the LSI-11 to perform these image
processing operations. The LSI-11 itself took approximately 6
times longer than the IBM 360/67. Based on this comparison,
the LSI-11 should be chosen as the 16 bit microcomputer
candidate. However, Intel's 8612 was selected because of its
larger physical memory addressing space and its system Multibus
support, which are much better suited for multiple
microcomputer system development.
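
The ratios quoted above can be recomputed from the single precision columns of Table IV.1. The short Python sketch below does so; the dictionary keys and the function name are illustrative. It suggests that the "13 to 14 times" and "approximately 6 times" figures describe the two heavy operations (statistics calculation and filtering), while the much shorter filter design step departs from both.

```python
# Execution times in seconds, copied from Table IV.1
# (single precision columns only).
TIMES = {
    "Spatial Statistics Calculation": {"ibm": 4.07, "lsi11": 25.46, "i8612": 334.25},
    "Spatial Filter Design": {"ibm": 0.0047, "lsi11": 0.24, "i8612": 2.82},
    "Perform Spatial Filter": {"ibm": 0.98, "lsi11": 5.62, "i8612": 79.8},
}


def slowdown_ratios(times):
    """Return {operation: (8612 time / LSI-11 time, LSI-11 time / IBM time)}."""
    return {
        op: (t["i8612"] / t["lsi11"], t["lsi11"] / t["ibm"])
        for op, t in times.items()
    }


RATIOS = slowdown_ratios(TIMES)
for op, (intel_vs_lsi, lsi_vs_ibm) in RATIOS.items():
    print(f"{op}: 8612/LSI-11 = {intel_vs_lsi:.1f}, LSI-11/IBM = {lsi_vs_ibm:.1f}")
```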
Further, two of the three spatial filter modules were
coded in assembly language with a 32 bit integer data format
on the 8612. It was found that the execution times are quite
short, suggesting that even today's Intel 16 bit microcomputer,
without the assistance of hardware arithmetic devices, can
perform these rather sophisticated image processing operations
very well when compared with the main frame computer IBM 360/67.
More specifically, it took 0.72 seconds to compute the
autocorrelation matrix elements for the 3x3 spatial filter,
averaged over the 32 x 32 image, and 0.47 seconds to perform
this 3x3 spatial filtering over the image.

2.   Implementation

In  this  chapter  we  will  present  the  implementation 
results  of  our  adaptive  filter  on  the  multiple  microcomputer 
system.   In  Section  B,  the  performance  of  spatial  filters  is 
discussed.   In  Section  C,  the  performance  of  adaptive  spatial 
filters  will  be  discussed. 

The functions of the various components of the
interconnections and communication controllers have been
described in previous sections using mainly signals generated
by function generators. In this section, a test program was
used to test and evaluate the data transfer behavior of the
system. This program is quite straightforward: it fetches data
from the RAM and displays them on a CRT terminal. However, the
program and the data are located in different parts of the
system to provide a thorough test of the data transfer and bus
arbitration behavior.

Three tests were made.

The objective of the first two tests is to measure
the maximum rate of data transfer on the system bus. For
this purpose, both the program and the data were stored either
in the global RAM located in another slave SBC, as in test
case 1, or in the global RAM located on the µPRO RAM board,
as in test case 2. Therefore, the system bus was kept very
busy because not only must the data be fetched via the bus,
but the program itself must also be read from memory external
to the testing SBC.


                         TABLE IV.2
             MEMORY ALLOCATION FOR MULTIBUS TEST

Test No.   Location of   Location of   Remarks
           Program       Data

   1       Slave SBC     Slave SBC     Program and data being
   2       µPRO RAM      µPRO RAM      run at maximum rate.

   3       Master SBC    µPRO RAM      Program and data being
                                       run at approximately 20%
                                       of the maximum rate.



The maximum rates at which this test can run with
one to six microcomputers are shown in Table IV.3. Several
important facts can be noticed.

(1)  The bus transfer rate of each SBC is reduced
as more and more SBCs want to use the bus, as it should be.

(2)  However, the maximum rate and the amount of
reduction vary from test to test. For example, in Test 1 we
were able to transfer at most 710 Kbyte/sec when only
one SBC was using the bus, as compared with a maximum rate of
911 Kbyte/sec for one SBC in Test 2. Test 2 showed
that it is quicker to get data out of the µPRO than out of the
RAM on a different SBC. This can be explained easily: the
control logic on the SBC must decide whether the memory
addressed is on-board or off-board. This decision takes time,
and thus slows down the transfer rate. When more SBCs were
added in these two tests, the transfer rate of every SBC
decreased. However, the rates of decrease were different in
Test 1 and Test 2, as shown in Table IV.3. They are also
plotted in Fig. 4.1 to give a graphical view. It is obvious
that substantial deterioration of the bus transfer rate took
place in these two cases, from 710 Kbyte/sec to 144 Kbyte/sec
in Test 1 and from 911 to 167.1 Kbyte/sec in Test 2.

(3)  It should be pointed out that such heavy
usage of the system bus should be allowed to happen only
during tests. If a programmer prepared an application
program with such heavy bus usage, he has failed miserably in
partitioning his program for parallel and pipeline
computation in the multiple microcomputer system.

(4)  Therefore, to provide a test more compatible
with real operational conditions, Test 3 was prepared,
which has its program in the RAM of the master SBC and its
data in the global RAM in the µPRO. Further, it was run at a
rate of 194.9 Kbyte/sec on the bus when only one SBC requested
the bus. It can be seen that the deterioration of the system
bus transfer rate is much more moderate, from 194.9 Kbyte/sec
for one SBC to 132 Kbyte/sec for six SBCs. This is a testimony
to the ability of the intercommunication controller to treat
all SBCs equally without allowing any one SBC to dominate the
bus usage.

                         TABLE IV.3

   SYSTEM BUS TRANSFER RATE (Kbyte/sec) FOR EVERY SBC IN
         THREE MULTIPLE MICROCOMPUTER SYSTEM TESTS

     No. of SBCs     Test 1      Test 2      Test 3

          1          710         911         194.9
          2          400.7       522         188
          3          277.7       345.33      184
          4          212         255.7       166
          5          171.8       202.3       147.9
          6          144         167.1       132

(5)  Further, the overhead loss of transfer rate
in arbitrating the bus usage of several microcomputers is
small. Let us consider Test Case 2. The maximum total bus
transfer rate took place when there were two SBCs using the
bus and was 2 x 522 = 1044 Kbyte/sec. When six SBCs were using
the bus, the total transfer rate on the bus was 6 x 167.1 =
1002.6 Kbyte/sec. The loss is only (1044 - 1002.6)/1044 =
0.0397, or 3.97%. Of course, each SBC suffered a loss of
(911 - 167.1)/911 = 81.658% in its bus usage rate. It is
interesting to note that 167.1 Kbyte/sec for six SBCs is close
to one-sixth of the rate of 911 Kbyte/sec available if one SBC
has the system all to itself.
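
The arithmetic in item (5) can be reproduced from the Test 2 column of Table IV.3. The following Python sketch multiplies each per-SBC rate by the number of SBCs to obtain the aggregate rate on the bus, then computes the two losses discussed above. The variable names are illustrative.

```python
# Per-SBC transfer rates in Kbyte/sec for Test 2, copied from Table IV.3.
PER_SBC_RATE = {1: 911.0, 2: 522.0, 3: 345.33, 4: 255.7, 5: 202.3, 6: 167.1}

# Aggregate rate on the bus = number of SBCs x rate achieved by each SBC.
aggregate = {n: n * rate for n, rate in PER_SBC_RATE.items()}

# Number of SBCs at which the aggregate rate peaks.
peak_n = max(aggregate, key=aggregate.get)

# Arbitration overhead: drop of the aggregate rate from its peak to six SBCs.
aggregate_loss = (aggregate[peak_n] - aggregate[6]) / aggregate[peak_n]

# Loss seen by one SBC when it must share the bus with five others.
per_sbc_loss = (PER_SBC_RATE[1] - PER_SBC_RATE[6]) / PER_SBC_RATE[1]

for n in sorted(aggregate):
    print(f"{n} SBC(s): aggregate = {aggregate[n]:.1f} Kbyte/sec")
print(f"peak at {peak_n} SBCs, aggregate loss = {aggregate_loss:.2%}, "
      f"per-SBC loss = {per_sbc_loss:.2%}")
```

The aggregate column makes the point of item (5) visible directly: the total traffic carried by the bus stays nearly constant from two to six SBCs, while the share available to each SBC shrinks roughly in proportion to the number of SBCs.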

B.   IMPLEMENTATION  OF  3 x  3  SPATIAL  FILTERING  ON 
MULTIPLE  MICROCOMPUTER  SYSTEM 

1.   Introduction 

Four different implementations were compared. They differed in
where the programs, variables and data were stored in the memory
hierarchy, and in the programming techniques used. For this
development, all programs and data were stored in RAM on the
single board microcomputers. This RAM is divided into two types:

o  Unshared RAM: "private" to the microcomputer on which the
RAM is located.

o  Shared RAM: "global" and accessible by other microcomputers
on the same Multibus.

TABLE IV.4
PROGRAM, DATA AND VARIABLE ALLOCATION

Implementation    Program     Variables    Data
--------------    --------    ---------    --------
Case 1            Ideal Case
Case 2*           Unshared    Unshared     Shared
Case 3            Unshared    Unshared     Shared
Case 4            Unshared    Shared       Shared
Case 5            Shared      Shared       Shared

The results are presented in Fig. 4.2, which plots the number of
frames per second that the 3x3 spatial filtering task can process
as a function of the number of microcomputers used to partition
the spatial filtering into parallel operations. The image size is
30 x 30 pixels. The partitioning splits the image into equal
parts, one for each microcomputer.
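
The partitioning described above can be sketched as follows. This is a minimal illustration, not the thesis program: it assumes the image is split into horizontal bands of rows, that only interior pixels are filtered, and that the bands are processed one after another in a single process standing in for separate SBCs. The kernel weights and pixel values are made up.

```python
def filter_rows(image, kernel, row_start, row_stop):
    """Apply a 3x3 kernel to the interior pixels of rows [row_start, row_stop)."""
    width = len(image[0])
    out = []
    for r in range(row_start, row_stop):
        out_row = []
        for c in range(1, width - 1):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
            out_row.append(acc)
        out.append(out_row)
    return out


def split_rows(first, last, parts):
    """Split the row range [first, last) into `parts` nearly equal bands."""
    base, extra = divmod(last - first, parts)
    bands, start = [], first
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        bands.append((start, stop))
        start = stop
    return bands


def partitioned_filter(image, kernel, workers):
    """Filter each band independently, then stitch the bands back together."""
    result = []
    for start, stop in split_rows(1, len(image) - 1, workers):
        # Each band depends only on its own rows plus one neighbor row on
        # each side, so every band could run on a different microcomputer.
        result.extend(filter_rows(image, kernel, start, stop))
    return result


# 30 x 30 image as in the experiment; the pixel values are illustrative.
image = [[(3 * r + 5 * c) % 17 for c in range(30)] for r in range(30)]
kernel = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]   # illustrative weights

whole = partitioned_filter(image, kernel, 1)
split = partitioned_filter(image, kernel, 4)
print(split_rows(1, 29, 4))
print("partitioned result matches single-worker result:", whole == split)
```

Because a nonrecursive 3x3 filter reads only the input image, the bands share no intermediate results, which is what makes an equal split across microcomputers straightforward.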

The results are discussed below.

a. The first case is not a measured result. It represents the
ideal enhancement of computation by using multiple
microcomputers. We first measured the execution speed of one
microcomputer performing a spatial filter over the whole image,
with program, variables and data all in the private unshared RAM
of the SBC. There was no bus usage, and therefore no overhead due
to bus communication. The maximum filtering speed is roughly two
thousand pixels processed by this spatial filter per second. For
more SBCs, we simply multiplied this rate by the number of
microcomputers and plotted a "linear enhancement" curve. This
represents the ideal case and serves as the goal for our
partitioning to approach.
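
The "linear enhancement" curve is simple enough to state as a formula in code. The sketch below assumes that all 900 pixels of a 30 x 30 frame count toward the pixel rate, which the text does not state explicitly, and the function names are illustrative.

```python
PIXELS_PER_SECOND_ONE_SBC = 2000.0   # "roughly two thousand" pixels/sec, from the text
PIXELS_PER_FRAME = 30 * 30           # assumption: all 900 pixels of a frame are counted


def ideal_frames_per_second(num_sbcs):
    """Ideal (Case 1) frame rate: the single-SBC rate scaled by the SBC count."""
    return num_sbcs * PIXELS_PER_SECOND_ONE_SBC / PIXELS_PER_FRAME


def fraction_lost(measured_fps, num_sbcs):
    """Share of the ideal rate that a measured frame rate fails to reach."""
    return 1.0 - measured_fps / ideal_frames_per_second(num_sbcs)


for n in range(1, 6):
    print(f"{n} SBC(s): {ideal_frames_per_second(n):.2f} frames/sec (ideal)")

# Hypothetical measurement at 80% of the ideal rate, for illustration only.
example = 0.8 * ideal_frames_per_second(5)
print(f"fraction lost: {fraction_lost(example, 5):.0%}")
```

Any measured curve in Fig. 4.2 can be compared against this line by calling `fraction_lost` with the measured frame rate.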

b. Let us start with the case of lowest performance, Case 5.
In this case, all program, variables and data were located in the
shared memory of another SBC. It obviously required the maximum
amount of transfer and system bus usage. It can be seen that the
performance saturated quite quickly, so that the computational
power of the added microcomputers was largely wasted.

c. Next, in Case 4, the program was stored in the private
memory of the computing SBC, but the variables and data were
stored in the global memory of another SBC. The throughput
performance improved almost linearly with respect to the number
of microcomputers, but at a rate lower than the "ideal linear
enhancement."

d. In Case 3, both the program and variables were stored in
the unshared private RAM, but the data were stored in the global
RAM of another SBC. Further improvement was accomplished.
However, about 20% of the computing capability was lost because
of the overhead for the arbitration of multiple microcomputer
requests.

e. In Case 2, the locations of the program, variables and data
are the same as in Case 3, but the programming is more careful in
the sense that the number of accesses to the system bus by each
microcomputer is minimized and, further, the occurrences of these
system bus accesses are distributed as evenly in time as
possible. It can be seen that the enhancement of total computing
power is much closer to the "ideal linear enhancement" case.

f. In summary, we have used the special case of spatial
filtering to explore the behavior and improvement of computing by
the multiple microcomputer system. It should be pointed out that
although there have been a lot of ideas in this field, real
experience is still very limited. Consequently, there is really
no consensus on the philosophy, approaches and methodologies of
effective partitioning for parallel and pipeline computing. This
thesis is a first step in testing uncharted waters. We only used
a spatial filter to test parallel processing. We have not yet
used a problem to test pipeline processing or combined
parallel/pipeline processing. Therefore, we do not claim that the
experience gained from this spatial filtering establishes a
general methodology for effective partitioning.

But we feel that the following guidelines will probably be
helpful when more complex problems are tested to develop a more
thorough philosophy of partitioning:

a)  The  bus  usage  should  be  minimized. 

b)  The  bus  usage  should  be  distributed  more  evenly 
in  time.   Concentration  of  bus  usage  should  be 
avoided. 
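
The effect of the two guidelines can be illustrated with a toy queueing model. This is not a model of the actual Multibus or of the thesis controller: it assumes time is divided into equal slots, that one transfer occupies the bus for exactly one slot, and that requests are served first come, first served. All names and numbers are hypothetical.

```python
def total_wait(request_times):
    """Total slots spent waiting when the bus serves one request per slot."""
    bus_free_at = 0
    waited = 0
    for t in sorted(request_times):
        start = max(t, bus_free_at)   # wait while the bus is still busy
        waited += start - t
        bus_free_at = start + 1       # each transfer holds the bus for one slot
    return waited


def concentrated(num_sbcs, per_sbc):
    """Every SBC issues its requests back to back from slot 0 (a burst)."""
    return [k for _ in range(num_sbcs) for k in range(per_sbc)]


def staggered(num_sbcs, per_sbc, period):
    """Requests are spread over `period` slots and offset from SBC to SBC."""
    step = period // (num_sbcs * per_sbc)
    if step < 1:
        raise ValueError("period is too short to hold every request")
    return [(k * num_sbcs + s) * step
            for s in range(num_sbcs) for k in range(per_sbc)]


burst = total_wait(concentrated(4, 6))
spread = total_wait(staggered(4, 6, 48))
fewer = total_wait(concentrated(4, 3))    # same burst pattern, half the bus usage
print("waiting slots, concentrated:", burst)
print("waiting slots, staggered:   ", spread)
print("waiting slots, half usage:  ", fewer)
```

In this toy model, halving the number of requests reduces the waiting, and spreading the same requests evenly over the period removes it, which mirrors guidelines a) and b).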

g. Meanwhile, it should be pointed out that this
implementation of spatial filtering is a test case based on a
real computation problem. In addition to the experience gained in
partitioning, the successful implementation of the spatial
filtering involving up to five microcomputers in parallel
processing convincingly proved that the random priority is
working correctly.

V.   CONCLUSION  AND  RECOMMENDATIONS

A.   CONCLUSION 

1.  Motivation 

This thesis was motivated by the needs of new smart sensor
developments. In anticipation of new sensitive, large mosaic
optical sensor arrays and the very sophisticated signal/data
processing capabilities to be offered by VLSI/VHSIC electronics,
very ambitious mission objectives are being proposed and
developed for new surveillance, search/track and weapon guidance
systems. These objectives require new signal processing
techniques to accomplish demanding goals. Further, they require
very sophisticated processor systems which are powerful enough to
implement the new signal processing algorithms and also small and
light enough for mounting on platforms of practical systems.

2.  Single  Objective  and  Dual  Tasks 

This thesis has a single objective, to help make the new
"smart sensors" practical, but consists of two tasks to achieve
this objective.

a.  Develop  new  adaptive  filter  techniques  to  process 
infrared  images  for  enhancement  of  "target  signal" 
to  "background  clutter  noise"  ratio, 

b.  Develop  a  new  multiple  microcomputer  system  to 
implement  this  type  of  image  processing. 

3.  Extensions and Contributions

Both studies, although motivated by the development of
"infrared smart sensors," are generic and can contribute to
broader fields well beyond the image processing problems in
infrared smart sensor systems.

4.  Results  I  -  Adaptive  Filters 

The  following  results  have  been  obtained: 

a. Adaptive filter research done in the past was surveyed.
It was found that:

o  Practically all past research dealt with one-dimensional
problems, except one study by B. Evenor, who extended the LMS
algorithm to images generated by Markov models.

o  Most approaches are based on LMS algorithms.

b.  In  this  thesis  the  LMS  algorithm  was  extended  to 
process  real  world  infrared  images. 

c.  A  new  approach  to  nonrecursive  adaptive  filters 
was  developed  which  is  similar  to  searching  for  the  extreme 
point  in  optimization  problems. 

d.  Two  optimization  criteria  were  considered: 

mMSE  =  minimization  of  mean  square  error 
MSNR  =  maximization  of  signal  to  noise  ratio. 

e. Seven different optimization/searching techniques were
developed:

o  Gradient approaches:
     Steepest descent
     Accelerated steepest descent
     Amir's method (mMSE only)

o  Conjugate gradient approaches:
     Fletcher-Reeves
     Pollack

o  Variable metric approach - Davidon-Fletcher-Powell

o  Amir's transform approach (MSNR only)
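
As an illustration of one of the search techniques listed above, the sketch below runs a textbook Fletcher-Reeves conjugate gradient search on a small quadratic error surface J(w) = 0.5 w'Rw - p'w. It is not the thesis program. The use of a quadratic as a stand-in for a mean square error surface, the matrix R, the vector p and all names are assumptions made for this example.

```python
def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    """Inner product of two equal-length lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def fletcher_reeves(r, p, w0, tol=1e-12):
    """Minimize J(w) = 0.5 w'Rw - p'w for a symmetric positive definite R."""
    w = list(w0)
    g = [gi - pi for gi, pi in zip(matvec(r, w), p)]   # gradient Rw - p
    d = [-gi for gi in g]              # first direction: steepest descent
    steps = 0
    for _ in range(len(p)):            # at most n steps on an n-dimensional quadratic
        gg = dot(g, g)
        if gg < tol:
            break
        rd = matvec(r, d)
        alpha = gg / dot(d, rd)        # exact line search, valid for a quadratic
        w = [wi + alpha * di for wi, di in zip(w, d)]
        g_new = [gi + alpha * rdi for gi, rdi in zip(g, rd)]
        beta = dot(g_new, g_new) / gg  # Fletcher-Reeves ratio
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
        steps += 1
    return w, steps


R = [[4.0, 1.0], [1.0, 3.0]]   # illustrative symmetric positive definite matrix
p = [1.0, 2.0]                 # illustrative right-hand side
w_opt, n_steps = fletcher_reeves(R, p, [0.0, 0.0])
print("weights:", w_opt, "found in", n_steps, "steps")
```

On a quadratic surface with two unknowns the search reaches the minimum, the solution of Rw = p, in two steps, whereas plain steepest descent generally needs more iterations on the same surface.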

f. These approaches were tested on two infrared test images:

o  Indiana - Blue spike band infrared image appropriate for
high altitude downward-looking infrared sensor systems.

o  China Lake - 10-13 micron thermal band infrared image
appropriate for shorter distance side-looking infrared sensor
systems.

The results are encouraging and show that these new adaptive
filters are effective in suppressing background clutter and
enhancing the "target signal" to "clutter noise" ratio.

5.   Results  II  -  Multiple  Microcomputer  System 

a. The tightly coupled multiple microcomputer research done in
the past was surveyed. It was found that:

o  There are many conceptual designs of new multiple
microcomputer systems. Only a very small number of these have
embarked on actual developments with both hardware and software
efforts.

o  More loosely coupled multiple microcomputer systems are
being developed. They are mostly computer networks.

o  Based on the survey of the open literature, there are only
two tightly coupled multiple microcomputer systems in operation
today. Both are at Carnegie Mellon University: C.mmp and Cm*. It
should be noted that although C.mmp is a multiple minicomputer
system, today's 16 bit microcomputers are fast approaching
minicomputer performance.

b. Based on an intensive consideration of the requirements of
typical new smart sensor systems, not only in the mission signal
processing area but also in the management, control, and
communication areas, it was decided that a hierarchical
architecture which simultaneously supports tightly and loosely
coupled systems is attractive.

c. A multiple star, multiple cluster architecture using
commercially developed 16 bit microcomputers was developed. A
complete star bus switch network was built, managed by a control
system consisting of three levels of control: random priority
controller, distributed controller and central controller.

d. The basic concept of this hardware architecture has been
tested by simulated intercommunications. Extensive tests in real
signal/data processing environments await the successful
development of operating systems.

6.   Results III  -  Implementation  of  Adaptive  Spatial
Filters  on  Microcomputers  and  Multiple
Microcomputer  Systems

a. The spatial filter program was coded for one mainframe, the
IBM 360-67, and two 16 bit microcomputers: the DEC LSI-11 and the
Intel 8612. The DEC LSI-11 has more mature floating point
mathematics software and a hardware arithmetic IC chip, but is
not as well suited for multiple microcomputer system development
as the Intel 8612, whose floating point software is still very
primitive. However, when coded in assembly language, the Intel
8612 performs the spatial filtering faster than the mainframe
coded in a high order language.

b. Implemented on a single 16 bit 8612 microcomputer, the
computation times for the 3x3 spatial filter and a 32x32 image
were measured as follows:

Spatial statistics computation         =  0.72 sec.
Adaptive spatial filter design         =  1.0  sec.
  (Conjugate gradient Pollack method)
Perform spatial filtering              =  0.47 sec.

c. Several ways of arranging the multiple microcomputer
implementation, by placing program, variables and data in the
unshared private RAM and/or the shared global RAM, have been
investigated.

It was found that the best enhancement of total execution speed
from adding microcomputers is obtained by storing the program and
variables in the private RAM and the data in the global RAM. The
image data is not moved into the microcomputer all at once.
Instead, the data is moved, one at a time, into the private RAM
of the microcomputer only moments before it is needed for
processing.
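
The idea of fetching data from global RAM only moments before it is needed can be sketched as below. The text does not specify the size of each transfer, so this example assumes one image row per fetch and a three-row private buffer. The `GlobalRam` class is a hypothetical stand-in for the shared memory that simply counts how many rows cross the bus.

```python
class GlobalRam:
    """Stand-in for shared RAM that counts how many rows cross the bus."""

    def __init__(self, image):
        self._image = image
        self.rows_fetched = 0

    def fetch_row(self, r):
        self.rows_fetched += 1
        return list(self._image[r])    # copy the row into "private" memory


def filter_band_streaming(ram, kernel, row_start, row_stop):
    """3x3 filter over rows [row_start, row_stop) holding only 3 rows locally."""
    window = [ram.fetch_row(row_start - 1), ram.fetch_row(row_start)]
    out = []
    for r in range(row_start, row_stop):
        window.append(ram.fetch_row(r + 1))   # fetched just before it is needed
        width = len(window[1])
        out.append([
            sum(kernel[i][j] * window[i][c - 1 + j]
                for i in range(3) for j in range(3))
            for c in range(1, width - 1)
        ])
        window.pop(0)                          # the oldest row is no longer needed
    return out


image = [[30 * r + c for c in range(30)] for r in range(30)]   # illustrative pixels
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]                     # illustrative weights

ram = GlobalRam(image)
filtered = filter_band_streaming(ram, kernel, 1, 29)
print("rows fetched over the bus:", ram.rows_fetched)
print("output size:", len(filtered), "x", len(filtered[0]))
```

Each row crosses the bus once and the fetches are interleaved with computation, so the bus traffic is both small and spread out in time.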

B.   RECOMMENDATION 
1.   General 

Both topics covered in this thesis are quite new. This
research only opens the gate a little into two fields worthy of
more investigation. Although this thesis is concerned mainly with
image processing developments and their implementation for
infrared smart sensors, the techniques developed are generic and
can be applied to much broader fields beyond smart sensors.

2.  Adaptive Filters

The new techniques based on the concepts of gradient and
optimization search can be applied to most of the adaptive filter
research done in the past using the LMS algorithm.

For adaptive image processing applications, they should be
used to develop adaptive temporal filters if a series of
successive frames of images is rather well registered spatially
from frame to frame, although there may be drift, jitter,
rotations, etc. between frames.

These adaptive filters should be tested on more challenging
real world images which have serious non-stationarity. Jamming
and interference noises should be considered. The convergence
time of the compiled adaptive filter programs should be measured
to obtain the relative speed of convergence of all the adaptation
methods. Adaptive filters for extended targets should be
developed.

3.  Multiple  Microcomputer  System 

Although the subject of multiple microcomputer systems is not
new, there are many unresolved questions that have hardly been
touched because of the extensive effort required to make any type
of multiple microcomputer system operational. Only two such
systems are known to be working today, C.mmp and Cm*, although
many system architectures have been proposed and conceptualized.
A small number of these have been simulated. A smaller number of
them are being emulated. An even smaller number of them are being
built. Simulations and modeling used today for multiple
microcomputer systems must be carefully and critically
scrutinized for their validity and usefulness. It is extremely
important to examine how the intercommunication overhead is
modeled and simulated. There is very little first-hand experience
in existence today. Therefore, a wide variety of problems
associated with the new multiple microcomputer systems must be
researched, examined and answered.

This thesis contributed to the formulation, design,
fabrication and test of a multiple microcomputer system which can
be used:

1.  Not only for developing effective ways of implementing
smart sensor image processing in general, and adaptive image
processing in particular,

2.  But also as a test bed to develop, verify, and improve
several basic issues of multiple microcomputer systems. Included
were considerations of:

a.  Effective  and  alternative  intercommunication  for 
combined  tightly  and  loosely  coupled  systems. 

b.  Effective  and  alternative  operating  systems  for 
real  time  signal  processing,  multi-tasking,  multi-users, 
security,  dynamic  reconfiguration  and  fault  tolerance. 

c.  Effective and alternative programming methodologies for
partitioning a given problem into a number of modules suitable
for combined pipeline and parallel implementation on multiple
microcomputer systems.

d.  Effective and alternative ways of using the distributed
capabilities of multiple microcomputer systems for fault
tolerance, self-maintenance and error recovery.